This dissertation is a study on the design and analysis of novel, optimal
routing and rate control algorithms in wireless, mobile communication
networks. Congestion control and routing algorithms up to now have been
designed and optimized for wired or wireless mesh networks. In those networks,
optimal algorithms (optimal in the sense that either the throughput is maxi- 
mized or delay is minimized, or the network operation cost is minimized) can 
be engineered based on the classic time scale decomposition assumption that 
the dynamics of the network are either fast enough so that these algorithms 
essentially see the average or slow enough that any changes can be tracked 
to allow the algorithms to adapt over time. However, as technological ad- 
vancements enable integration of ever more mobile nodes into communication 
networks, any rate control or routing algorithms based, for example, on aver- 
aging out the capacity of the wireless mobile link or tracking the instantaneous 
capacity will perform poorly. The common element in our solution to engineering
efficient routing and rate control algorithms for mobile wireless networks
is to make the wireless mobile links seem as if they are wired or wireless links
to all but a few nodes that directly see the mobile links (either the mobiles or
nodes that can transmit to or receive from the mobiles) through an appropriate
use of queuing structures at these selected nodes. This approach allows
us to design end-to-end rate control or routing algorithms for wireless mobile 
networks so that neither averaging nor instantaneous tracking is necessary, as 
we have done in the following three networks. 

A network where we can easily demonstrate the poor performance of a 
rate control algorithm based on either averaging or tracking is a simple wireless 
downlink network where a mobile node moves but stays within the coverage 
cell of a single base station. In such a scenario, the time scale of the varia- 
tions of the quality of the wireless channel between the mobile user and the 
base station can be such that the TCP-like congestion control algorithm at 
the source cannot track the variation and is therefore unable to adjust the
instantaneous coding rate at which the data stream can be encoded, i.e., the 
channel variation time scale is matched to the TCP round trip time scale. On 
the other hand, setting the coding rate for the average case will still result 
in low throughput due to the high sensitivity of the TCP rate control algo- 
rithm to packet loss and the fact that below average channel conditions occur 
frequently. In this dissertation, we will propose modifications to the TCP 
congestion control algorithm for this simple wireless mobile downlink network 
that will improve the throughput without the need for any tracking of the
wireless channel. 

An intermittently connected network (ICN) is another network where the
classic assumption of time scale decomposition is no longer relevant. An in- 
termittently connected network is composed of multiple clusters of nodes that 
are geographically separated. Each cluster is connected wirelessly internally, 
but inter-cluster communication between two nodes in different clusters must 
rely on mobile carrier nodes to transport data between clusters. For instance, 
a mobile would make contact with a cluster and pick up data from that clus- 
ter, then move to a different cluster and drop off data into the second cluster. 
On contact, a large amount of data can be transferred between a cluster and 
a mobile, but the time duration between successive mobile-cluster contacts 
can be relatively long. In this network, an inter-cluster rate controller based 
on instantaneously tracking the mobile-cluster contacts can lead to
underutilization of the network resources; if it is based on the long-term average
achievable rate of the mobile-cluster contacts, this can lead to large buffer
requirements within the clusters. We design and analyze a throughput-optimal
routing and rate control algorithm for ICNs with minimum delay, based on
a back-pressure algorithm that relies on neither averaging out nor tracking
the contacts.

The last type of network we study consists of stationary nodes that are
far apart from each other and rely on mobile nodes to communicate
with each other. Each mobile transport node can be on one of several fixed
routes, and these mobiles drop off or pick up data to and from the stationaries
that are on that route. Each route has an associated cost that must be paid
by the mobiles to be on it (a longer route would have a larger cost since it would
require the mobile to expend more fuel), and stationaries pay different costs
to have a packet picked up by the mobiles on different routes. The challenge 
in this type of network is to design a distributed route selection algorithm 
for the mobiles and for the stationaries to stabilize the network and minimize 
the total network operation cost. A sum cost minimization algorithm based
on average source rates and mobility patterns would require global
knowledge of the rates and movement patterns to be available at all stationaries
and mobiles, rendering such an algorithm centralized and weak in the presence of
network disruptions. Algorithms based on instantaneous contacts, on the other
hand, would be impractical, as the mobile-stationary contacts are extremely
short and infrequent.
Chapter 1



Introduction 



Mobile communication networks have one essential problem that dif- 
ferentiates them from more traditional networks like the Internet or WiFi 
mesh networks. Communication algorithms for those networks are engineered 
with the assumption that the network dynamics are either slow enough to be 
tracked or so fast that the algorithms would essentially see the time average. 
For example, the round-trip time between any two computers in the Internet
is now often less than 50 ms. This enables an end-to-end congestion controller like
the Transmission Control Protocol (TCP) to detect network congestion quickly
and adjust the transmission rate in response. In this case, the TCP algorithm
tracks the congestion inside the network and adjusts the transmission rate
accordingly. On the other hand, in cellular networks, the wireless channel
between a base station and a mobile user fluctuates so rapidly over one data
frame transmission time that it cannot be tracked; however, the average
bit error rate (BER) in each frame is relatively static over multiple frames,
and this allows a channel coding algorithm with a fixed coding rate to be used.

The classic time-scale separation assumption of trackability or averag- 
ing no longer holds in mobile networks. Consider a communication network 
used by soldiers deployed in remote terrains. These soldiers have organized
themselves into multiple, geographically separated clusters, and are equipped 
with wireless communication devices so that they may communicate with oth- 
ers in the same cluster. However, because of the geographical separation and 
limited range of the wireless transceivers, they must rely on mobile data
transporters to carry data between clusters. When a mobile data transporter comes
into contact with a cluster, it can pick up a large amount of data per contact 
with that cluster; it can then move to the destination cluster, and drop off 
that data into that cluster. If a rate control algorithm at the inter-cluster traf- 
fic source (inter-cluster traffic has the source and the destination in different 
clusters) adjusts the source rate by tracking the mobile-cluster contacts and in 
essence trying to track the instantaneous available inter-cluster communication 
data rate, then even though there is a temporary large increase in the available 
inter-cluster rate when the contact is made, there might not be enough rate 
available within the cluster through which the inter-cluster flow must travel; 
in effect, only a small portion of the large instantaneous inter-cluster rate can
be used. 

Alternatively, if the rate control algorithm tries to adjust the rates 
based on the "average" inter-cluster rates, then either there will be a large 
queue build-up at every node (as we will demonstrate in a later chapter) or
the algorithm runs into the problem of finding out what the "average" rate is, 
which makes it weak in the presence of changes and failures in the network. 

In this dissertation, we design and analyze novel routing and rate control
algorithms in mobile communication networks where tracking or averaging
the network dynamics is impossible or leads to inefficient performance. To- 
wards that end, we examine three types of networks; we briefly introduce each 
type and highlight our contributions below. 

1.1 Problem Statements and Contributions 
1.1.1 TCP for Wireless Downlink Networks 

It is well-known that TCP connections perform poorly over wireless 
links due to channel fading. To combat this, techniques have been proposed 
where channel quality feedback is sent to the source, and the source utilizes 
coding techniques to adapt to the channel state. However, the round-trip 
timescales quite often are mismatched to the channel-change timescale, thus
rendering these techniques ineffective in this regime. (By the time the
feedback reaches the source, the channel state has changed.) 

In this dissertation, we propose a source coding technique that, when
combined with a queuing strategy at the wireless router, eliminates the need
for channel quality feedback to the source. We show that in a multi-path
environment (e.g., the mobile is multi-homed to different wireless networks),
the proposed scheme enables statistical multiplexing of resources, and thus 
increases TCP throughput dramatically. 
1.1.2 Time-Scale Decoupled Routing and Rate Control in Intermittently Connected Networks

The second type of network we study in this dissertation is an inter- 
mittently connected network (ICN) composed of multiple clusters of wireless 
nodes. Within each cluster, nodes can communicate directly using the wire- 
less links; however, these clusters are far away from each other such that di- 
rect communication between the clusters is impossible except through mobile 
contact nodes. These mobile contact nodes are data carriers that shuffle be- 
tween clusters and transport data from the source to the destination clusters. 
This dissertation focuses on a queue-based cross-layer technique known as
the back-pressure algorithm. The algorithm is known to be throughput optimal,
as well as resilient to disruptions in the network, making it an ideal candidate 
communication protocol for our intermittently connected network. 

We design a back-pressure routing/rate control algorithm for ICNs. 
Though it is throughput optimal, the original back-pressure algorithm has 
several drawbacks when used in ICNs, including long end-to-end delays, a large
number of potential queues, and a loss in throughput due to intermittency.
We present a modified back-pressure algorithm that addresses these
issues. 

1.1.3 Efficient Data Transport with Mobile Carriers 

For the third type of network, we consider a network of stationary nodes 
that rely on mobile nodes to transport data between them. We assume the
mobile nodes can control their mobility pattern to respond to data traffic loads,
as well as satisfy some other secondary objectives, such as surveillance
requirements. We study this problem in the framework of cost minimization, and we
derive a dual iterative algorithm that results in an optimal mobility pattern for
minimizing the network-wide cost.

1.2 Organization 

Each subsequent chapter focuses on one of the three network types above.
In Chapter 2, we study the problem of TCP congestion control in cellular
networks. We briefly discuss the TCP background and present our modifica- 
tions that will improve the TCP throughput using multi-homing. We then 
present our simulation results. 

Chapter 3 is on rate control and routing in intermittently connected
networks using a back-pressure algorithm. We describe the back-pressure
algorithm and highlight the benefits and shortcomings of the algorithm in ICNs.
We then present our solutions and present our experimental results obtained 
from our test bed. 

In Chapter 4, we present our mobility control algorithm that minimizes 
the network-wide cost. We present our network model and state our cost
minimization optimization problem and iterative solution algorithm, followed 
by our experimental results. 

We end this dissertation with a conclusion and a discussion on possible 
future research topics.

Chapter 2



TCP for Wireless Downlink Networks 

2.1 Introduction 

The Transmission Control Protocol (TCP) is the most widely used congestion
control protocol in the Internet. When a router in the Internet is used
beyond its capacity, its buffer will overflow and start to drop packets. A source
using TCP will interpret dropped packets as a signal of congestion, and it will
promptly reduce its transmission rate to relieve the congestion.

TCP was designed and optimized with the assumption that the net- 
works that it was supposed to operate over have highly reliable node-to-node 
links such that dropped packets due to poor link quality are highly unlikely. 
Hence, a dropped packet meant only one thing - congestion. 

However, in wireless networks, TCP has no way of distinguishing congestion
drops from drops due to a poor-quality wireless channel. A typical wireless
link is designed with an average BER (Bit Error Rate) on the order of $10^{-5}$, which
results in an average packet error/drop probability (PER) of 5-10% assuming
1 KB packets. In addition, the average BER (and PER) of a wireless link can
fluctuate over time, and the rate of fluctuation poses a significant problem for
TCP. If plain TCP is used over the wireless links without any modifications,
this considerably reduces TCP's average congestion window size and prevents 
it from enlarging the window size to any significant portion of the ideal size, the 
bandwidth-delay product, resulting in a low utilization rate [6,32,40]. Take,
for example, a plain TCP (Reno) connection made over a one-hop wireless link.
Suppose the bandwidth-delay product is infinite for this connection, and
the packet drop probability due to a bad wireless channel is d. In Figure 2.1(a),
we plot the average window size of this TCP connection as d varies from 0.001% 
to 10%. As the figure shows, the average window size (and the throughput) 
decreases substantially as d increases. A similar performance graph is shown 
in Figure 5 of [32]. 
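
The 5-10% packet error figure quoted above can be checked with a short calculation. The sketch below assumes independent bit errors and no link-layer error correction, so a packet survives only if every one of its bits does; the function name `packet_error_rate` and the 1000-byte packet size are illustrative choices, not taken from the text.

```python
def packet_error_rate(ber: float, packet_bytes: int) -> float:
    """PER when each bit fails independently with probability `ber` (assumption)."""
    bits = 8 * packet_bytes
    # A packet is received intact only if all of its bits are.
    return 1.0 - (1.0 - ber) ** bits


if __name__ == "__main__":
    # BER of 1e-5 with a 1000-byte packet, as in the discussion above.
    print(f"PER = {packet_error_rate(1e-5, 1000):.3f}")
```

Under these assumptions the result is a little under 8%, which falls inside the quoted range.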



[Figure 2.1, panel titles: (a) Average Window Size vs. Drop Probability; (b) Average Window Size vs. p]

(a) Average window size of a plain TCP connection made over one wireless link
with packet drop probability d due to fading. As d increases, the average
window size and the throughput drop rapidly.

(b) Average window size of TCP using FEC at a fixed coding rate as p varies.
Even if the coding rate is sufficient for the average channel drop
probability, if the drop probability changes every RTT, then the throughput
will still be low. The drop probability is equal to 0.05 with probability p,
and is equal to 0.15 with probability 1 - p.

Figure 2.1: Performance of TCP over one-hop wireless link

Figure 2.2: A multipath TCP-RLC connection made over a well-provisioned
wired network and M wireless routers. We exploit multi-path wireless channel 
diversity to increase throughput. The dashed lines represent wireless links. 

In this chapter, we address the problem of low TCP throughput in the
simple topology of TCP senders connected via a wireline network to intermediate
wireless routers, and TCP receivers connected by wireless channels to
multiple intermediate routers (see Figure 2.2). An example scenario would
be a cellular access network (such as UMTS/WiMax) where the cellular base 
station is connected to the wired backbone, and only the link between the base 
station and the mobile user is wireless. Although multi-homing is not
implemented in current cellular networks, with the introduction of femto-cells,
it is conceivable that in a campus scenario with a number of femto-cells, the 
mobile user may be able to receive downlink data simultaneously on multiple 
links from multiple femto-cells; this motivates the multi-path model in Figure 
2.2. 
2.1.1 Shortcomings of Existing Solutions

To combat the adverse nature of the wireless network, multiple solutions 
have been proposed, all involving a separation of time-scales between the rate 
of channel variation and the TCP congestion window evolution. One can break 
the TCP connection between a wired server and a mobile into two components: 
wired and wireless [7]. However, this approach needs a proxy at the wireless 
base-station, and breaks TCP end-to-end semantics. 

By contrast, one could protect TCP (without proxying at the wire- 
less router) from channel-level variations by suitable physical layer schemes. 
Of these, the commonly deployed solution in UMTS/WiMax systems involves
channel coding, adaptive modulation and/or automatic repeat request (ARQ 
or hybrid ARQ) deployed in a lower layer protocol to deal with packet drops 
resulting from channel variations that are at a much faster rate than the end- 
to-end TCP round-trip time (RTT). However, these schemes could lead to 
variations in the rate provided to the TCP connections, and can lead to sub- 
optimal TCP performance [19]. Papers such as [26] improve TCP performance 
over downlink wireless networks by dynamically adjusting PHY layer
parameters optimized for TCP. However, such strategies require measurements
both at the transport layer (such as the TCP sending rate) and at the physical
layer (such as channel quality), at the cost of increased complexity at the
cellular base station.

The alternate solution (see [10, 71, 75]) is to code the data stream at a 
specific forward error coding rate at the application layer so that the decoded 
TCP data stream can withstand drops due to bad wireless channels. In [75],
the authors use Reed-Solomon coding at a fixed rate to encode a stream of 
TCP packets in order to deal with random losses. In [71], the authors use 
network coding combined with an ACK scheme found in [72] and TCP-Vegas-like
throughput measurements to adapt TCP over wireless links. However,
such an approach requires the variation in channel drop rate to be quasi-static 
relative to the time-scale of feeding back this channel drop rate information to 
the source so that the coding rate can be adjusted. 

2.1.2 Motivation 

2.1.2.1 Channel Variation and RTT Have Same Time-Scale 

In many realistic settings, the packet drop rate of the wireless channel 
can change at the time scale of round-trip time of the TCP connection. For 
example, consider a mobile user traveling at 2-5 km/h using the current UMTS
network (carrier frequency approximately 2 GHz in the U.S.). This user's wireless
channel coherence time is roughly 20-50 ms [57], a number well within the range
of RTT for the Internet. (Coherence time is roughly a measure of how long 
a wireless channel stays constant, and therefore a rough measure of how fast 
the packet drop rate changes.) 
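
The 20-50 ms range can be sanity-checked from the speed and carrier frequency. The sketch below assumes the common rule of thumb T_c ≈ 9 / (16π f_d), where f_d = v f_c / c is the maximum Doppler shift; the text does not say which definition of coherence time [57] uses, so this formula and the function names are assumptions made for illustration.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def max_doppler_hz(speed_kmh: float, carrier_hz: float) -> float:
    """Maximum Doppler shift f_d = v * f_c / c, with v converted from km/h to m/s."""
    return (speed_kmh / 3.6) * carrier_hz / SPEED_OF_LIGHT


def coherence_time_s(speed_kmh: float, carrier_hz: float) -> float:
    """Rule-of-thumb coherence time T_c = 9 / (16 * pi * f_d) (assumed definition)."""
    return 9.0 / (16.0 * math.pi * max_doppler_hz(speed_kmh, carrier_hz))


if __name__ == "__main__":
    for v in (2.0, 5.0):
        print(f"{v} km/h -> {coherence_time_s(v, 2e9) * 1e3:.1f} ms")
```

With a 2 GHz carrier, this rule of thumb gives roughly 48 ms at 2 km/h and roughly 19 ms at 5 km/h, in line with the range quoted above.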

In such scenarios, multiple ARQ requests and link-level ACKs and 
NACKs are unhelpful - they cause retransmission delays and timeouts that 
may adversely affect the RTT estimation and the retransmission time-out 
(RTO) mechanism, and therefore throughput. Moreover, forward error correction
coding at a fixed rate (at the TCP source) is not helpful since the drop
rate at the wireless link is not quasi-static relative to the feedback time scale.
If the drop rate changes every RTT, the information about the drop rate will
not reach the TCP sender in time to be useful, since by the time the information
reaches the sender, the drop rate will have changed. Thus, tracking this mobile
user's wireless downlink channel does not help improve TCP throughput: the
channel quality feedback reaches the source too late to be of use.

Take, for example, a TCP connection made over one wireless link. The
packet drop probability due to bad channel conditions changes every RTT.
Assume that the drop probability $d \in \{0.05, 0.15\}$, where $d = 0.05$ with
probability $p$ and $d = 0.15$ with probability $1 - p$. Suppose that
the TCP flow uses forward error correction coding at the rate of 10%, i.e., for
every 10 data packets, 1 coded packet is generated. Using this FEC coding
rate, the TCP data packets can reliably be delivered to the destination if the
drop probability d is 0.05. However, because the drop probability is bad
(d = 0.15) often enough, the TCP throughput will be significantly reduced. In
Figure 2.1(b), we plot the average window size of the TCP connection as we
vary p. (We assumed that the bandwidth-delay product is infinite.) As the
graph shows, even though we have used an FEC coding rate that is sufficient
for the average drop probability, the throughput is still low because the drop
probability changes every RTT.

In summary, coding at a fixed rate will not work when the wireless
channel variation and TCP RTT have the same time-scale.
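
The effect described in this subsection can be reproduced qualitatively with a small Monte Carlo sketch. It is not the simulation behind Figure 2.1(b); it assumes an idealized code in which each received coded packet repairs one lost data packet, rounds the number of coded packets up to at least one per window, and never lets the window fall below one packet. The function name, the seed and the slot count are illustrative.

```python
import math
import random


def simulate_avg_window(p_good, slots=2000, fec_rate=0.10,
                        d_good=0.05, d_bad=0.15, seed=1):
    """Average AIMD window when the drop probability is redrawn every RTT slot.

    p_good is the probability that a slot sees d_good rather than d_bad.
    Assumption: a slot succeeds if lost data packets <= received coded packets.
    """
    rng = random.Random(seed)
    w, total = 1, 0
    for _ in range(slots):
        total += w
        d = d_good if rng.random() < p_good else d_bad
        n_coded = math.ceil(fec_rate * w)  # fixed-rate FEC, rounded up (assumption)
        lost_data = sum(rng.random() < d for _ in range(w))
        got_coded = sum(rng.random() >= d for _ in range(n_coded))
        if lost_data <= got_coded:
            w += 1              # additive increase
        else:
            w = max(1, w // 2)  # multiplicative decrease
    return total / slots


if __name__ == "__main__":
    for p in (0.0, 0.5, 0.9, 1.0):
        print(p, round(simulate_avg_window(p), 1))
```

The window is unbounded here, mirroring the infinite bandwidth-delay product assumed in the text, so the printed averages are only meaningful relative to one another.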

2.1.2.2 Multiple Path Statistical Multiplexing 

There has been much research into multipath TCP connections. The 
obvious advantage of multipath TCP is that it can balance the load on the 
multiple paths such that paths experiencing temporarily high capacity can 
carry more packets than paths experiencing low capacity. Further, multiple 
TCP connections can be useful for load balancing among multiple wireless 
interfaces - for instance, in a situation where a mobile node is connected
to a 3G network base-station as well as a femto-cell base-station. In this
scenario, one would want to get statistical multiplexing gain between the two
wireless interfaces, as it is likely that the wireless fading states of the
two interfaces will differ (e.g., when the 3G interface has a bad channel, the
femto-cell interface could have a good channel). However, to exploit this, a
naive implementation would require that packets stored at the 3G base-station 
be transferred to the other base-station (femto) through a wired back-haul. 
Clearly, time-scales of RTT over the back-haul and channel variation would 
render this impractical. 

A second issue one faces when running a TCP connection over multiple 
paths is the problem of out-of-order packet delivery, which can cause congestion 
window collapse even if the network has plenty of capacity [80]. [54] gets 
around this problem by delaying and reordering received packets before they 
are passed up to the TCP layer on the receive side. [83] uses duplicate selective 
ACKs (DSACK) and dynamically changes the duplicate ACK threshold to
address the out-of-order problem. 

By using random linear coding, our proposed TCP modifications can 
be naturally extended to multiple paths, and we will show that coding + 
TCP enables the network to behave as though packets are virtually shared 
among the different base-stations without the need for a back-haul between 
the various base-stations. This in turn leads to multiplexing gains. Further,
coding + TCP can easily deal with out-of-order delivery of packets. 

2.1.3 Other Related Work 

Other coding approaches: Recently, inspired by [1, 37] and others, network 
coding schemes have been used in the context of wireless networks in order 
to improve throughput. [21] and [60] use network coding at intermediate 
nodes and exploit the shared wireless spectrum to improve TCP throughput. 
In our approach, we use random linear coding (RLC) [45] at the end nodes 
to improve TCP throughput, and the intermediate router does not perform 
any coding operations. The concept of using random linear coding for TCP 
over wired networks has appeared recently [13]; however, our work here is for
a hybrid network with the goal of improving TCP throughput over time-varying
wireless channels.

TCP window statistics under AQM: There is a considerable body of 
literature [30, 76] on modeling the TCP window process in the presence of 
active queue management (AQM) systems, especially random early detection 
(RED) [25]. [76] presents a weak limit of the window size process by proving
a weak convergence of triangular arrays. [5] presents a fluid limit of the TCP 
window process, as the number of concurrent flows sharing a link goes to 
infinity, and the authors show that the deterministic limiting system provides 
a good approximation for the average queue size and total throughput. None 
of the previous works mentioned above treats the situation when the loss rate
cannot be tracked due to a mismatch between the channel change time-scale and
the RTT time-scale.

This chapter shows throughput gains that can be achieved by multi- 
path diversity when TCP is combined with an ACK scheme similar to the one 
in [71] and a priority queuing strategy found in [11], plus RLC [45].

2.1.4 Main Contributions 

In this work, we employ (i) random linear coding, (ii) priority-based 
queuing at wireless routers and (iii) multi-path routing to demonstrate that 
throughput can be increased significantly for TCP over downlink wireless net- 
works even when channel variations are on the same time-scale as RTT. Our 
theoretical result shows that we can obtain full statistical multiplexing gain from 
multi-path TCP. 

Specifically, our analysis shows that we can achieve TCP throughput
of $O(E[P]\,C)$ in the multiple-path (multi-homing) case with our modifications, in
the absence of channel quality feedback from the destination to the sender.
Here, $E[P]$ is the mean probability of successful packet transmission of the
time-varying wireless channel; $C$ is the capacity of the wireless router.

Further, our modifications to TCP, which we call TCP-RLC, present
an orderwise gain over the performance of plain TCP, which is $O(1)$, in the
presence of random packet loss for wired-wireless hybrid networks, where the
random packet loss rates change at the RTT time scale and cannot be tracked.

2.2 Analytical Model 

We consider slotted time. Each time slot is equal to the round-trip time
between the senders and the receivers. The TCP-RLC source $i$ maintains a
congestion window of size $W_i(t)$ for the $t$-th RTT interval. The congestion
window is in units of packets; packets are assumed to be of fixed size. In each
RTT slot, source $i$ wants $W_i(t)$ data packets to be transferred to sink $i$. We
model the additive increase, multiplicative decrease (AIMD) evolution of the
TCP congestion window as follows:

$$W_i(t+1) = \mathbf{1}_{\text{success}}\,(W_i(t) + 1) + \mathbf{1}_{\text{drop}}\,\lceil W_i(t)/2 \rceil \tag{2.1}$$

where the random variable $\mathbf{1}_{\text{success}} = 1$ when all data packets transmitted in
the $t$-th RTT interval for destination $i$ have been successfully received at the
destination; $\mathbf{1}_{\text{drop}} = 1 - \mathbf{1}_{\text{success}}$ takes the value 1 when either (i) the receiver
cannot recover one or more data packets corrupted by the packet drop process
or (ii) a packet is marked by the router due to the presence of an active queue
manager (AQM). AQM marks a connection with window size $W$ according to
the probability given by $f(W)$, and thus the congestion window is deliberately
halved.

In our analytical model, we assume that the routers can measure the 
window size W of a TCP flow, and based on the window size, each router can 
mark the flow with probability f(W). If a flow is marked by a router, the 
TCP source halves the TCP window size $W(t)$ and reduces the transmission
rate by a factor of two.

We let $f(W)$ be the probability that a flow is marked when the window
size is $W$, and we let $f_{\text{chan}}(W)$ denote the probability that the destination is
unable to reconstruct the $W$ data packets due to the channel packet drop/corruption
processes over all paths, which would halve the TCP congestion window.

The combined effects of the AQM with the marking function $f(w)$ and
the packet drops from the wireless channels can be encapsulated into $f_{\text{eff}}(w)$,
where

$$\begin{aligned}
f_{\text{eff}}(w) &= 1 - (1 - f_{\text{eff}}(w)) \\
&= 1 - (1 - f(w))(1 - f_{\text{chan}}(w)) \\
&= f(w) + f_{\text{chan}}(w) - f(w)\,f_{\text{chan}}(w).
\end{aligned} \tag{2.2}$$

Thus, the TCP congestion window will decrease by half or increase by one
with probabilities $f_{\text{eff}}(w)$ and $1 - f_{\text{eff}}(w)$, respectively, if the window size is $w$.
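
As a concrete reading of (2.1) and (2.2), the following sketch combines the two halving causes and applies one AIMD step. The function names are hypothetical, and rounding the halved window up (so it never drops below one packet) is an assumption made here for illustration.

```python
import math


def effective_halving_prob(f_aqm: float, f_chan: float) -> float:
    """Probability the window is halved, from AQM marking or unrecoverable loss.

    Treats the two causes as independent events, as in the product form of (2.2).
    """
    return 1.0 - (1.0 - f_aqm) * (1.0 - f_chan)


def aimd_step(w: int, halved: bool) -> int:
    """One slot of the window recursion: +1 on success, half on a drop or mark."""
    # Rounding up is an assumption; it keeps the window at one packet or more.
    return math.ceil(w / 2) if halved else w + 1


if __name__ == "__main__":
    print(effective_halving_prob(0.1, 0.2))
    print(aimd_step(10, halved=False), aimd_step(10, halved=True))
```

Keeping the probability calculation separate from the window update makes it easy to swap in a different marking function without touching the recursion.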

In this chapter, we are interested in finding the average throughput
under the optimal AQM, i.e., $E[W^*(t)]$, where $W^*(t)$ is the congestion window
under the AQM that maximizes $E[W^*(t)]$. Note that the optimal AQM may
be no AQM at all, but throughput under the optimal AQM has to be at least as
large as that under any arbitrary AQM. The presence of AQM greatly simplifies
our analysis. We later back our claims with simulation results that used no
AQM.

In practical scenarios, the evolution of the congestion window size is 
limited by the sender/receiver buffer size; we will ignore this to simplify our
analysis. We will also neglect TCP timeouts for the same reason. 

2.2.1 Random Linear Coding 

In each time slot, the source $i$ takes $W_i(t)$ data packets and generates
$rW_i(t)$ ($r > 0$) coded packets as follows: let each data packet $x_{ik}$,
$k = 1, 2, \ldots, W_i(t)$, be represented as an element of some finite field $\mathbb{F}_q$; choose
elements $\alpha_{ikj} \in \mathbb{F}_q$ uniformly at random and generate a coded packet

$$D_{ij} = \sum_{k=1}^{W_i(t)} \alpha_{ikj}\, x_{ik}$$

for $j = 1, 2, \ldots, rW_i(t)$. The receiver can decode, with very high probability,
any dropped data packets if a sufficient number of linearly independent coded
and data packets are received; this probability approaches one as the field size
from which the coding coefficients are drawn increases. Hence, in the rest of
the work, we will make the following assumption as a simplification.

Assumption 1. Suppose $W$ data packets are used to generate coded packets
via RLC. If $G$ coded packets are received by the TCP destination, then up to $G$
missing data packets out of the $W$ data packets can be recovered.

Thus, if the number of missing data packets from $W$ exceeds $G$ in an
RTT slot, the congestion window will halve (and $\mathbf{1}_{\text{drop}} = 1$). That is, if $G$ coded
packets are received and no more than $G$ data packets of the original
$W$ data packets are lost, then the receiver can recover all of the original data.
For a detailed exposition on RLC and justification of Assumption 1, see [46].
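
To make the encoding and recovery steps concrete, the sketch below implements random linear coding over a prime field and decodes by Gaussian elimination. It is an illustration under simplifying assumptions, not the TCP-RLC implementation: each packet is a single field symbol, the field is the integers modulo the prime 2^31 - 1, and the coding coefficients are assumed to travel with each coded packet. The names `encode`, `decode` and `Q` are hypothetical.

```python
import random

Q = 2_147_483_647  # prime field size (2^31 - 1); an illustrative choice


def encode(data, n_coded, rng):
    """Return n_coded pairs (coeffs, value), value = sum_k coeffs[k] * data[k] mod Q."""
    coded = []
    for _ in range(n_coded):
        coeffs = [rng.randrange(Q) for _ in data]
        value = sum(a * x for a, x in zip(coeffs, data)) % Q
        coded.append((coeffs, value))
    return coded


def decode(received, coded, w):
    """Recover all w data symbols, or return None if the equations are insufficient.

    received: dict {index: symbol} of data packets that arrived uncoded.
    coded:    list of (coeffs, value) pairs that arrived.
    """
    missing = [k for k in range(w) if k not in received]
    n = len(missing)
    # Subtract the known symbols so each row is an equation in the missing ones only.
    rows = []
    for coeffs, value in coded:
        known = sum(coeffs[k] * x for k, x in received.items())
        rows.append([coeffs[k] for k in missing] + [(value - known) % Q])
    # Gauss-Jordan elimination modulo Q.
    for col in range(n):
        pivot = next((r for r in range(col, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            return None  # rank deficient: more losses than useful coded packets
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], Q - 2, Q)  # modular inverse (Fermat's little theorem)
        rows[col] = [(v * inv) % Q for v in rows[col]]
        for r in range(len(rows)):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [(a - factor * b) % Q for a, b in zip(rows[r], rows[col])]
    solved = dict(received)
    for i, k in enumerate(missing):
        solved[k] = rows[i][n]
    return [solved[k] for k in range(w)]


if __name__ == "__main__":
    rng = random.Random(7)
    data = [rng.randrange(Q) for _ in range(8)]
    coded = encode(data, 3, rng)
    # Three data packets are lost; all three coded packets arrive.
    arrived = {k: x for k, x in enumerate(data) if k not in (1, 4, 6)}
    print(decode(arrived, coded, len(data)) == data)
```

A large field is used so that randomly drawn coefficient vectors are linearly independent with probability close to one, which is the regime Assumption 1 idealizes. When more data packets are missing than coded packets were received, `decode` returns `None`, corresponding to the window-halving event.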

2.2.2 Network Topology 

The network topology we consider in this chapter is a TCP-RLC connection
made over $M > 1$ paths going through routers $R_1, \ldots, R_M$, each with
capacity $C$ (black connection in Figure 2.2). Only the link between $R_i$,
$i = 1, \ldots, M$, and the destination is wireless; the links between the routers
and the source are wired. The wired network is assumed to be well-provisioned,
with the paths having the same RTT, and the wireless routers do not store packets
from one RTT slot to another in the analytical model. In practice, the paths
will have different, but similar, RTT values.

We assume that the wired section of the network has greater capacity 
than the wireless section. This is a reasonable assumption, since all of the 
currently existing downlink networks have wired back-plane network with ex- 
cess capacity, and the wireless downlink channel is greatly limited due to tight 
spectrum and power constraints. 
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2.2.3 Wireless Downlink Channel 

We model packet drops in the wireless channel between a wireless router 
and the TCP-RLC destination as a simple i.i.d. packet drop process whose 
parameter remains constant for each RTT-interval. This is similar to the 
block-noise model common in wireless communication literature. 

Within each RTT-interval $t$, the probability that a packet transmitted over the air by wireless router $i \in \{1, \ldots, M\}$ for the destination is successfully received is given by $p_i(t)$. The $j$-th packet transmitted over the air by wireless router $i$ for the destination is corrupted (dropped) according to a Bernoulli error process $H_j^i(t)$ defined as

$$H_j^i(t) = \begin{cases} 1 & \text{w.p. } p_i(t) \text{ if the } j\text{-th packet is received correctly} \\ 0 & \text{w.p. } 1 - p_i(t) \text{ if the } j\text{-th packet is dropped} \end{cases}$$

with parameter $p_i(t) \in \{p_1 = p_{\min}, p_2, \ldots, p_\Pi = p_{\max}\}$ acting upon each packet over the air independently of other packets in the same RTT-interval. We assume that $p_1 > 0$ and $p_1 < p_2 < \cdots < p_\Pi$. We assume that $p_i(t)$ changes over time with $P(p_i(t) = p_k) = P_k$. Thus, the channel packet-delivery-probability parameter itself changes with time (this corresponds to a fading state that changes over time), and at any time the actual packet delivery probability depends on the instantaneous value of this (random, time-varying) parameter.
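A minimal sketch of this channel model follows, assuming a finite list of states with their probabilities. The function names are hypothetical, and `min_redundancy` simply evaluates the threshold $2(1 - p_{\min})/p_{\min}$ that this subsection assumes the redundancy factor $r$ exceeds in the multiple-path topology.

```python
import random


def draw_delivery_probability(states, probs, rng):
    """Draw p_i(t) for one RTT-interval: state p_k is chosen with probability P_k."""
    return rng.choices(states, weights=probs, k=1)[0]


def transmit(n_packets, p_success, rng):
    """Return H_1..H_n: 1 if the packet is received, 0 if dropped (i.i.d. Bernoulli)."""
    return [1 if rng.random() < p_success else 0 for _ in range(n_packets)]


def min_redundancy(p_min):
    """Threshold 2(1 - p_min)/p_min that the redundancy factor r is assumed to exceed."""
    return 2 * (1 - p_min) / p_min
```

The delivery probability is drawn once per RTT-interval and then held fixed while each packet in that interval is dropped independently, which is the block-noise behaviour described above.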

Lastly, we assume that $r > 2(1 - p_{\min})/p_{\min}$ for the multiple-path topology.

2.2.4 Priority Transmission

In the multiple-path network topology, the source maintains a path-level congestion window $w_l(t)$ for each path $l$; this is in addition to $W(t)$. The idea in our multi-path algorithm is to have a rate control on each path. The transmission rate of the high priority packets on path $l$ is controlled by $w_l(t)$: $w_l(t)$ is the number of high priority (data) packets in transit towards the destination in time slot $t$; in addition, $rw_l(t)$ low priority (coded) packets are in transit as well. If $w_l(t)$ packets (either high or low priority) are successfully received by the destination, the rate controller on path $l$ at the source increases $w_l$ by one, so that $w_l(t+1) = w_l(t) + 1$ in time slot $t + 1$. If not, $w_l$ is halved, so that $w_l(t+1) = w_l(t)/2$.

The main idea in our algorithm is to separate the problem of wireless link reliability on each path from the TCP rate control algorithm that operates over all paths ($W(t)$). On each path, the only sign of congestion is if the total number of packets (high + low priority) received by the destination on path $l$ is less than the number of high priority packets it should have received on that path; that is, the router on path $l$ did not have enough capacity to either 1) send all high priority packets it received or 2) send enough low priority coded packets to compensate for any high priority packets that were dropped by the wireless link. Unlike in traditional TCP over wired links, gaps in the sequence of received packet numbers are not a valid way of measuring congestion over wireless links. The mechanism acting on $w_l(t)$ that we described in the previous paragraph is a way to "measure" congestion on each path; $w_l(t)$ is the number of packets that path $l$ can guarantee to deliver in time slot $t$ to the destination through wireless router $l$.

Thus, in time slot $t$, path $l$ can have any $w_l(t)$ of the $(1 + r)W(t)$ data and coded packets delivered to the receiver. The evolution of $w_l(t)$ is similar to that of $W(t)$:

$$w_l(t+1) = 1_{\text{success}}\,(w_l(t) + 1) + 1_{\text{fail}}\,\lceil w_l(t)/2 \rceil \tag{2.3}$$

where $1_{\text{success}} = 1$ if $w_l(t)$ packets are successfully received in RTT slot $t$, and $1_{\text{fail}} = 1 - 1_{\text{success}}$.

Before the packets are sent out on a path, they are marked either high or low priority. The number of high priority packets sent out in any given RTT slot is equal to $w_l(t)$. For each high priority packet sent out, the path transmits $r$ low priority packets.

Note that as long as $W(t)$ data packets are successfully received (or recovered), $W(t)$ will increase according to eq. (2.1). Since the available rates on the $M$ paths $(w_1(t), w_2(t), \ldots, w_M(t))$ evolve independently, we need a way to measure the total available instantaneous rate through all paths combined, and the main purpose of having $W(t)$ is to measure this total, combined rate. The algorithm at the source is designed such that in time slot $t$, the main rate controller takes $W(t)$ data packets, generates $rW(t)$ coded packets, and leaves both types of packets in some memory location. The path-wise rate controller on path $l$ takes $w_l(t)$ data packets, marks them high priority, and sends them out along with $rw_l(t)$ coded packets, which it marks low priority. If in time slot $t$, $W(t)$ data packets are successfully reconstructed at the destination, the main controller deduces that the total available rate through all paths is equal to or greater than $W(t)$, and hence $W(t+1) = W(t) + 1$. Otherwise, it deduces that the total available rate is less, and $W(t+1) = W(t)/2$.
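The two levels of rate control described in this subsection can be sketched as follows. This is an illustration under stated assumptions rather than the source implementation: "successfully received" is read as "at least $w_l(t)$ packets arrived on the path", the main window is floored at one packet after halving, and all function names are hypothetical.

```python
import math


def update_path_window(w_l, received_on_path):
    """Eq. (2.3): grow by one on success, otherwise halve (rounded up)."""
    if received_on_path >= w_l:  # assumption: success means at least w_l arrivals
        return w_l + 1
    return math.ceil(w_l / 2)


def update_main_window(W, block_decoded):
    """Main controller: W + 1 if the block was reconstructed, else W / 2."""
    if block_decoded:
        return W + 1
    return max(1, W // 2)  # assumption: integer halving with a floor of one packet


def packets_for_path(w_l, r):
    """Number of (high priority data, low priority coded) packets sent on a path."""
    return w_l, r * w_l


def source_slot(W, path_windows, received_per_path, block_decoded):
    """Advance all controllers by one RTT slot; returns (new_W, new_path_windows)."""
    new_paths = [update_path_window(w, got)
                 for w, got in zip(path_windows, received_per_path)]
    return update_main_window(W, block_decoded), new_paths
```

Each path window reacts only to what arrived on its own path, while the main window reacts only to whether the whole coding block could be reconstructed, which is the separation the algorithm relies on.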

2.2.5 Wireless Router 

We assume that the low priority packets are transmitted by the router only when there are no high priority packets that can be transmitted. For example, in the case $M = 1$, the remainder of the nominal channel capacity $C$, namely $C - w_1(t)$, is used to transmit low priority packets in the $t$-th RTT slot, where the evolution of $w_1(t)$ is defined in eq. (2.3).

We also assume that the wireless router maintains a pair of queues for each TCP-RLC connection made through that router. While a number of queues that scales with the number of flows would be prohibitive for routers in the Internet core, we argue that it is reasonable for wireless downlink routers, as these routers serve a relatively small number of mobile users in the same cell. In addition, per-flow queue maintenance is already done in cellular architectures for reasons of scheduling, etc. (see [9]).

For each TCP-RLC connection, one queue (FIFO) is used to handle high priority packets; the other queue (LIFO) is used for low priority packets. (A LIFO queue is used for low priority packets because they are transmitted only when the router has excess capacity. The router will run out of capacity often enough that, if FIFO were used for low priority packets, the queue would eventually be backed up with old low priority packets, and any newly arriving low priority packets would be dropped due to (low priority) buffer overflow.) In our analysis, we assume that packets are not stored in the router's queues from one RTT time slot to another. We relax this in our simulations.
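The per-flow queue pair and the strict priority rule can be sketched as below. The class name `FlowQueues` and its methods are hypothetical, buffer size limits are left out for brevity, and `end_of_slot` models the analytical assumption that nothing is carried over between RTT slots.

```python
from collections import deque


class FlowQueues:
    """One FIFO queue for high priority packets and one LIFO queue for low priority."""

    def __init__(self):
        self.high = deque()  # served oldest first
        self.low = deque()   # served newest first

    def enqueue(self, packet, high_priority):
        if high_priority:
            self.high.append(packet)
        else:
            self.low.append(packet)

    def serve(self, capacity):
        """Send up to `capacity` packets; low priority only uses leftover capacity."""
        sent = []
        while len(sent) < capacity and self.high:
            sent.append(self.high.popleft())
        while len(sent) < capacity and self.low:
            sent.append(self.low.pop())
        return sent

    def end_of_slot(self):
        """Analytical model only: discard whatever was not sent in this RTT slot."""
        self.high.clear()
        self.low.clear()
```

Serving the low priority queue from its newest end means that freshly arrived coded packets, which belong to the current coding block, are the ones that use any spare capacity.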

2.3 Multiple Path Analysis 

We first analyze the evolution of the path-level congestion window. Given that the path-level window size is $w_l(t)$ for path $l$ in time slot $t$, $w_l(t+1) = \lceil w_l(t)/2 \rceil$ if, out of the $C$ packets transmitted by the router, fewer than $w_l(t)$ packets with the same block number are received; let $\chi_{\text{fail}}$ be this event.

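Using Chernoff's bound, we can show for any $\epsilon > 0$,

$$P[\chi_{\text{fail}} \mid p_l(t) = p_1,\ w_l(t) < (1 - \epsilon)p_1 C] < \exp(-\epsilon^2 p_1 C/2).$$

Similarly, we can show that

$$P[\chi_{\text{fail}} \mid p_l(t) = p_1,\ w_l(t) > (1 + \epsilon)p_1 C] > 1 - \exp(-\epsilon^2 p_1 C/2).$$

In addition, it is straightforward to show that for any $i = 2, \ldots, \Pi$,

$$P[\chi_{\text{fail}} \mid p_l(t) = p_i,\ w_l(t) < (1 - \epsilon)p_1 C] < P[\chi_{\text{fail}} \mid p_l(t) = p_1,\ w_l(t) < (1 - \epsilon)p_1 C]$$

since $p_1 < p_i$. Thus,

$$P[\chi_{\text{fail}} \mid w_l(t) < (1 - \epsilon)p_1 C] < \sum_{i=1}^{\Pi} P_i \exp(-\epsilon^2 p_1 C/2) = \exp(-\epsilon^2 p_1 C/2) \tag{2.4}$$

and

$$P[\chi_{\text{fail}} \mid w_l(t) > (1 + \epsilon)p_1 C] > P[\chi_{\text{fail}},\ p_l(t) = p_1 \mid w_l(t) > (1 + \epsilon)p_1 C] > P_1\,(1 - \exp(-\epsilon^2 p_1 C/2)). \tag{2.5}$$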

From eqs. (2.4), (2.5) and (2.3), we see that $0.5(1 - \epsilon)p_1 C < w_l(t) < (1 - \epsilon)p_1 C$ w.p. at least $1 - \exp(-\epsilon^2 p_1 C)$. Since $(1 + r)w_l(t)$ high and low priority packets are sent on path $l$ and $r > 2(1 - p_1)/p_1$, the wireless router on path $l$ sends $C$ packets in each time slot w.p. at least $1 - \exp(-\epsilon^2 p_1 C)$.

In a network with $M$ paths, the probability that $0.5(1 - \epsilon)p_1 C < w_l(t) < (1 - \epsilon)p_1 C$, $\forall l = 1, \ldots, M$, is at least $(1 - \exp(-\epsilon^2 p_1 C))^M > 1 - M\exp(-\epsilon^2 p_1 C)$, where $C \gg M$. Thus, we make the following reasonable assumption:

Assumption 2. On each path $l = 1, \ldots, M$, the wireless router transmits $C$ (high + low priority) packets to the TCP-RLC destination in each time slot.
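To get a feel for why Assumption 2 is reasonable when $C \gg M$, the two probability bounds used just before it can be evaluated numerically. The sketch below is illustrative only; the function names and the sample values of $\epsilon$, $p_1$, $C$ and $M$ in the usage lines are assumptions, not values from the analysis.

```python
import math


def per_path_violation_bound(eps, p1, C):
    """exp(-eps^2 * p1 * C): bound on the probability that one path leaves the range."""
    return math.exp(-eps ** 2 * p1 * C)


def all_paths_in_range_bounds(eps, p1, C, M):
    """Return ((1 - x)^M, 1 - M*x) with x the per-path bound; the first dominates."""
    x = per_path_violation_bound(eps, p1, C)
    return (1 - x) ** M, 1 - M * x


# Hypothetical sample values: the bound tightens quickly as C grows.
product_small_C, union_small_C = all_paths_in_range_bounds(0.2, 0.5, 200, 8)
product_large_C, union_large_C = all_paths_in_range_bounds(0.2, 0.5, 2000, 8)
```

Because the per-path term decays exponentially in $C$ while the union bound only grows linearly in $M$, a modest increase in capacity makes the probability that every path behaves as assumed very close to one.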

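From Assumption 2, we have that $W(t)$ is an irreducible, aperiodic Markov chain; $W(t+1)$ depends only on $W(t)$ and $p_l(t)$, $l = 1, \ldots, M$. Let $\pi_W$ denote its stationary distribution.

Let $\chi_{\text{failure}} = \{\sum_{l=1}^{M}\sum_{j=1}^{C} H_j^l(t) < W(t)\}$. Fix $\rho > 1$ and let $T$ denote the event $\{\frac{1}{M}\sum_{l=1}^{M} p_l(t) > \frac{\rho W(t)}{MC}\}$. Then

$$f_{\text{chan}}(W(t)) = P[T]\,P[\chi_{\text{failure}} \mid T] + P[\neg T]\,P[\chi_{\text{failure}} \mid \neg T].$$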

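Lemma 1. $f_{\text{chan}}(w)$ can be bounded as below:

$$f_{\text{chan}}(w) \le \begin{cases} \exp(-MCL_1) & \text{if } 0 \le \frac{\rho w}{MC} < p_1 \\ \exp(-MCL_2) + \exp\left(-M\,l'\left(1 - \frac{\rho w}{MC}\right)\right) & \text{if } p_1 \le \frac{\rho w}{MC} < E[P] \end{cases}$$

where $L_1$ and $L_2$ are non-zero constants, and

$$l'(a) = \max_{-\infty < \theta < \infty} \{\theta a - \log M'(\theta)\} \tag{2.6}$$

and $M'(\theta) = E[e^{\theta(1 - P)}]$.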



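Proof: Let $\bar{p}(t) = \frac{1}{M}\sum_{l=1}^{M} p_l(t)$. If $p_1 \le \frac{\rho W(t)}{MC} < E[P]$, by Chernoff's bound

$$P[\neg T] = P\left[\frac{1}{M}\sum_{l=1}^{M}(1 - p_l(t)) \ge 1 - \frac{\rho W(t)}{MC}\right] \le \exp\left(-M\,l'\left(1 - \frac{\rho W(t)}{MC}\right)\right).$$

Then,

$$P[\chi_{\text{failure}} \mid T] \le P\left[\chi_{\text{failure}} \,\middle|\, \bar{p}(t) = \frac{\rho W(t)}{MC}\right] \tag{2.7}$$

$$\le \exp\left(-MC\,D\left(1 - \frac{W(t)}{MC} \,\middle\|\, 1 - \frac{\rho W(t)}{MC}\right)\right) \tag{2.8}$$

where $D(x\|y)$ is the Kullback-Leibler distance between $x$ and $y$¹; eq. (2.7) follows because lowering $\bar{p}(t)$ will increase the probability that the transmission will not be successful, and eq. (2.8) follows by Chernoff's bound.

Let $L_1 = \inf_{w:\ \rho w/(MC) \in [p_1, E[P])} D\left(1 - \frac{w}{MC} \,\middle\|\, 1 - \frac{\rho w}{MC}\right)$. Thus, if $p_1 \le \frac{\rho W(t)}{MC} < E[P]$, we have

$$f_{\text{chan}}(w) \le \exp\left(-M\,l'\left(1 - \frac{\rho w}{MC}\right)\right) + \exp(-MCL_1)$$

since $P[T],\ P[\chi_{\text{failure}} \mid \neg T] \le 1$.

¹ $D(x\|y) = x\log(x/y) + (1 - x)\log((1 - x)/(1 - y))$.

If $\frac{\rho W(t)}{MC} < p_1$, we have $P[T] = 1$ because $\bar{p}(t) \ge p_1$. In addition,

$$P[\chi_{\text{failure}} \mid T] \le P[\chi_{\text{failure}} \mid \bar{p}(t) = p_1] \tag{2.9}$$

$$\le \exp\left(-MC\,D\left(1 - \frac{W(t)}{MC} \,\middle\|\, 1 - p_1\right)\right). \tag{2.10}$$

Inequality (2.9) follows from the fact that decreasing the probability that $H_j = 1$ will increase the probability that not enough packets will be received by the receiver to decode $w$ packets. Inequality (2.10) follows from Chernoff's bound. Let $L_2 = \inf_{w:\ \rho w/(MC) \in [0, p_1)} D\left(1 - \frac{w}{MC} \,\middle\|\, 1 - p_1\right)$. Then

$$P[\chi_{\text{failure}} \mid T] \le \exp(-MCL_2),$$

and

$$f_{\text{chan}}(w) \le \exp(-MCL_2)$$

in the region $\frac{\rho w}{MC} < p_1$, since $P[\neg T] = 0$. ∎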
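The two exponents that appear in Lemma 1, the Kullback-Leibler distance and the rate function of eq. (2.6), can be evaluated numerically. The sketch below is a hedged illustration: the function names are hypothetical, the channel profile is assumed to be a finite list of states with probabilities, and the maximization over $\theta$ is restricted to a bounded interval, which is adequate only when the argument lies strictly inside the range of $1 - P$.

```python
import math


def kl_bernoulli(x, y):
    """D(x||y) = x log(x/y) + (1-x) log((1-x)/(1-y)), with the convention 0 log 0 = 0."""
    def term(a, b):
        return 0.0 if a == 0 else a * math.log(a / b)
    return term(x, y) + term(1 - x, 1 - y)


def chernoff_failure_bound(n, w, q):
    """exp(-n D(1 - w/n || 1 - q)): bound on P[fewer than w of n arrive], for w/n <= q."""
    return math.exp(-n * kl_bernoulli(1 - w / n, 1 - q))


def exact_failure_prob(n, w, q):
    """Exact P[Binomial(n, q) < w], used only to sanity-check the bound."""
    return sum(math.comb(n, k) * q ** k * (1 - q) ** (n - k) for k in range(w))


def log_mgf(theta, states, probs):
    """log M'(theta) = log E[exp(theta (1 - P))] for a finite channel profile."""
    return math.log(sum(P * math.exp(theta * (1 - p)) for p, P in zip(states, probs)))


def rate_function(a, states, probs, lo=-200.0, hi=200.0, iters=200):
    """l'(a) from eq. (2.6) via ternary search over theta (the objective is concave)."""
    def g(theta):
        return theta * a - log_mgf(theta, states, probs)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if g(m1) < g(m2):
            lo = m1
        else:
            hi = m2
    return g((lo + hi) / 2)
```

For a two-state profile the rate function reduces to a Kullback-Leibler distance between rescaled Bernoulli parameters, which gives a convenient cross-check of the numerical maximization.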
Theorem 1. Fix $\rho > 1$ and $M$. Then,

$$E[W^*(t)] \ge \max\left\{0.75\,p_1 MC/\rho^2 - 1,\ \min\left\{\frac{E[P]MC}{\rho^2}\left[1 - 3\lfloor\beta\rfloor^{-1} - 2(e^{-1} + \delta_2)\right],\ \beta\left[1 - 3\lfloor\beta\rfloor^{-1} - 2(e^{-1} + \delta_2)\right]\right\}\right\}$$

where $\delta_1, \delta_2 \to 0$ as $C \to \infty$ and where $\beta \in [p_1 MC/\rho^2,\ E[P]MC/\rho^2]$ is the solution to the equality

$$1/\beta = \exp(-M\,l'(1 - \rho\beta/MC)) \tag{2.11}$$

(for $M$, $C$ large enough, we can show a solution exists) and where $l'(\cdot)$ is defined in equation (2.6).

Proof: If $\exists \beta \in [p_1 MC/\rho^2,\ E[P]MC/\rho^2]$ such that $\beta$ is the solution to eq. (2.11), then this case corresponds to the situation when there are enough paths to start gaining path diversity, but not enough paths to gain complete path diversity. (We assume $C$ is large enough so that $l'(1 - E[P]/\rho) < CL$.)

Let

$$f(w) = \begin{cases} \dfrac{2\beta^{-1} - f_{\text{chan}}(w)}{1 - f_{\text{chan}}(w)} & \text{if } w < \lfloor\beta\rfloor \\ 1 & \text{if } w \ge \lfloor\beta\rfloor \end{cases} \tag{2.12}$$

so that $f_{\text{eff}}(w) = 2\beta^{-1}$ if $w < \lfloor\beta\rfloor$ and $1$ if $w \ge \lfloor\beta\rfloor$. Under this AQM marking scheme,

$$E[W(t+1)] = E[W(t) + 1] - E[1_{\text{drop}}\lfloor 1 + W(t)/2 \rfloor]$$
$$1 = E[f_{\text{eff}}(W)\lfloor 1 + W(t)/2 \rfloor]$$
$$1 \le E[f_{\text{eff}}(W)(1 + W(t)/2)]$$
$$1 \le E[f_{\text{eff}}(W)] + E[f_{\text{eff}}(W)\,W/2].$$



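Thus,

$$E[f_{\text{eff}}(W)] = 2\beta^{-1} + (1 - 2\beta^{-1})\,\pi_W(\lfloor\beta\rfloor)$$

and

$$E\left[f_{\text{eff}}(W)\,\frac{W}{2}\right] = \beta^{-1}E[W] + (1 - 2\beta^{-1})\,\frac{\lfloor\beta\rfloor}{2}\,\pi_W(\lfloor\beta\rfloor). \tag{2.13}$$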



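Let $S_k(t) = 1$ if $W(t) = k$ and $0$ if $W(t) \ne k$. For $k \ge \lfloor\lfloor\beta\rfloor/2\rfloor$, we have

$$P[S_k(t)] = P[S_{k-1}(t-1)]\,P[W(t) = k \mid W(t-1) = k-1] = (1 - 2\beta^{-1})\,P[S_{k-1}(t-1)].$$

Thus,

$$P[S_{\lfloor\beta\rfloor}(t + \lceil\lfloor\beta\rfloor/2\rceil + 1)] = P[S_{\lfloor\lfloor\beta\rfloor/2\rfloor - 1}(t)] \prod_{k=\lfloor\lfloor\beta\rfloor/2\rfloor}^{\lfloor\beta\rfloor} P[W = k \mid W = k-1] = (1 - 2\beta^{-1})^{\lceil\lfloor\beta\rfloor/2\rceil + 1}\,P[S_{\lfloor\lfloor\beta\rfloor/2\rfloor - 1}(t)].$$

Let $T_{\lfloor\lfloor\beta\rfloor/2\rfloor - 1}$ be the time it takes $W(t)$ to return to $\lfloor\lfloor\beta\rfloor/2\rfloor - 1$. It is known that

$$P[S_{\lfloor\lfloor\beta\rfloor/2\rfloor - 1}(t)] = \left(E_{\lfloor\lfloor\beta\rfloor/2\rfloor - 1}[T_{\lfloor\lfloor\beta\rfloor/2\rfloor - 1}]\right)^{-1}.$$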

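If, starting from state $\lfloor\lfloor\beta\rfloor/2\rfloor - 1$, the window size increases by $k > 0$ before halving, the total number of steps before $W(t)$ returns to $\lfloor\lfloor\beta\rfloor/2\rfloor - 1$ is at least $k + \lfloor\lfloor\beta\rfloor/2\rfloor - 1 - \lceil(\lfloor\lfloor\beta\rfloor/2\rfloor - 1 + k)/2\rceil$. Minimizing over $k$, the least number of steps before $W(t)$ enters state $\lfloor\lfloor\beta\rfloor/2\rfloor - 1$ again is at least $\lfloor(\lfloor\lfloor\beta\rfloor/2\rfloor - 1)/2\rfloor$. Thus, $E_{\lfloor\lfloor\beta\rfloor/2\rfloor - 1}[T_{\lfloor\lfloor\beta\rfloor/2\rfloor - 1}] \ge \lfloor(\lfloor\lfloor\beta\rfloor/2\rfloor - 1)/2\rfloor$ and

$$P[S_{\lfloor\lfloor\beta\rfloor/2\rfloor - 1}(t)] \le \left(\lfloor(\lfloor\lfloor\beta\rfloor/2\rfloor - 1)/2\rfloor\right)^{-1}.$$

So we have

$$\pi_W(\lfloor\beta\rfloor) \le \frac{(1 - 2\beta^{-1})^{\lceil\lfloor\beta\rfloor/2\rceil + 1}}{\lfloor(\lfloor\lfloor\beta\rfloor/2\rfloor - 1)/2\rfloor}. \tag{2.14}$$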

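Combining the above with equation (2.13), we get

$$E[f_{\text{eff}}(W)] \le 2\beta^{-1} + \frac{[1 - 2\beta^{-1}]^{\lceil\lfloor\beta\rfloor/2\rceil + 1}}{\lfloor(\lfloor\lfloor\beta\rfloor/2\rfloor - 1)/2\rfloor} \le 3(\lfloor\beta\rfloor)^{-1}.$$

Combining equation (2.13) with equation (2.14), we get

$$E\left[f_{\text{eff}}(W)\,\frac{W}{2}\right] = \beta^{-1}E[W] + (1 - 2\beta^{-1})\,\frac{\lfloor\beta\rfloor}{2}\,\pi_W(\lfloor\beta\rfloor) \le \beta^{-1}E[W] + \frac{\lfloor\beta\rfloor}{2}\cdot\frac{(1 - 2\beta^{-1})^{\lceil\lfloor\beta\rfloor/2\rceil + 1}}{\lfloor(\lfloor\lfloor\beta\rfloor/2\rfloor - 1)/2\rfloor}.$$

Since $(1 - 2\beta^{-1})^{\lceil\lfloor\beta\rfloor/2\rceil + 1}$ converges to $e^{-1}$, we have

$$E\left[f_{\text{eff}}(W)\,\frac{W}{2}\right] \le \beta^{-1}E[W] + 2(e^{-1} + \delta)$$

where $\delta$ is some small, positive constant. Combining the bounds on $E[f_{\text{eff}}(W)\,W/2]$ and $E[f_{\text{eff}}(W)]$ and substituting into inequality (2.13), we get

$$1 \le 3(\lfloor\beta\rfloor)^{-1} + \beta^{-1}E[W] + 2(e^{-1} + \delta).$$

Solving for $E[W]$ and using $E[W^*(t)] \ge E[W(t)]$, we get

$$\beta\left[1 - 3\lfloor\beta\rfloor^{-1} - 2(e^{-1} + \delta_2)\right] \le E[W^*].$$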

If $\nexists \beta \in [p_1 MC/\rho^2,\ E[P]MC/\rho^2]$ such that $1/\beta = \exp(-M\,l'(1 - \rho\beta/MC))$, then either $1/\beta < \exp(-M\,l'(1 - \rho\beta/MC))$ for $\beta = p_1 MC/\rho^2$, or $1/\beta > \exp(-M\,l'(1 - \rho\beta/MC))$ for $\beta = E[P]MC/\rho^2$.

If $1/\beta < \exp(-M\,l'(1 - \rho\beta/MC))$ for $\beta = p_1 MC/\rho^2$, then there are not enough paths to gain path diversity. We can use no AQM (i.e., $f(W) = 0$, $\forall W$), and we will be guaranteed throughput of at least $0.75\,p_1 C/\rho^2 - 1$ as $C \to \infty$.

If $1/\beta > \exp(-M\,l'(1 - \rho\beta/MC))$ for $\beta = E[P]MC/\rho^2$, then there is a sufficient number of paths to gain all path diversity. Let $\beta = E[P]MC/\rho^2$ and let

$$f(w) = \begin{cases} \dfrac{2\beta^{-1} - f_{\text{chan}}(w)}{1 - f_{\text{chan}}(w)} & \text{if } w < \beta \\ 1 & \text{if } w \ge \beta \end{cases}$$

so that $f_{\text{eff}}(w) = 2\beta^{-1}$ if $w < \beta$ and $1$ otherwise. We can follow the same analysis as in the case when the solution to eq. (2.11) exists (just let $\beta$ be equal to $E[P]MC/\rho^2$ and the result will follow through) and arrive at the conclusion that

$$E[W^*(t)] \ge \frac{E[P]MC}{\rho^2}\left[1 - 3\lfloor\beta\rfloor^{-1} - 2(e^{-1} + \delta_2)\right].$$



Because $w_l(t) \le p_1 C$ and $r > 2(1 - p_1)/p_1$, the capacity of the wired section only needs to be about twice that of the wireless section; i.e., the wired capacity needs to be slightly greater than $2C$.
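As a quick check of the factor of two, the per-path load can be evaluated with $r$ taken exactly at its threshold value, which is a simplification made here for illustration:

```latex
(1 + r)\,w_l(t) \;\le\; (1 + r)\,p_1 C,
\qquad
r = \frac{2(1 - p_1)}{p_1}
\;\Longrightarrow\;
(1 + r)\,p_1 C = \bigl(p_1 + 2 - 2p_1\bigr)\,C = (2 - p_1)\,C \;<\; 2C .
```

A value of $r$ slightly above the threshold raises this load slightly, which is consistent with the statement that the wired capacity should be slightly greater than $2C$.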

2.4 Simulation Results 

Before we present our simulation results, we note that the implementation of TCP-RLC uses novel techniques not found in the original version of TCP. We provide a brief description of some of these techniques in the appendix, to which we refer readers interested in the implementation details.

We simulate a single flow with multiple paths, where we exploit path diversity. In our simulations, the random coefficients used to encode each packet are drawn from a field of size 8191; thus, for coding block sizes smaller than ~500, the probability of two coded packets in the same coding block not being "independent" is negligible. We fixed the size of all packets to 256 bytes. Each flow had two buffers allocated at the router whose capacities were equal to the transmission rate times the RTT. We used a bimodal channel profile, $\{(p_{\min}, P_{\min}), (p_{\max}, P_{\max})\}$, with $p_{\max}$ set to 1 in all our simulations. We varied $p_1 = p_{\min}$ from 0.1 to 0.5 in increments of 0.1 and set the state probabilities to $P_1 = 0.1$ and $P_2 = 0.9$. Thus, $E[P]$ varied from 0.91 to 0.95. This corresponds to the scenario where the downlink channel can be controlled to provide good capacity most of the time (90% of the time), but bad channel conditions can occur frequently enough to destroy TCP throughput (10% of the time). Note that using a fixed coding rate adjusted for the average channel quality would not work for this scenario; coding is not needed most of the time and becomes useless when needed, because the coding rate is insufficient against poor quality channel conditions that can occur frequently enough. We varied the number of paths from 1 to 8. We set the total capacity to 1 Mbps, so that the per-path capacity is 1 Mbps/$M$, where $M$ is the number of paths in our simulation. In our simulations, all paths have the same bimodal channel profile and the same capacity. (Our simulation scenario corresponds to the situation when the quality of the wireless channels can be controlled (for example, by increasing transmission power temporarily) up to a certain degree to be good enough most of the time, but there are occasions when the channels are so degraded that nothing can be done.)
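The quoted range of $E[P]$ follows directly from the bimodal profile. The short sketch below reproduces the arithmetic; the function names are hypothetical and the sweep simply mirrors the parameter values listed above.

```python
def mean_delivery_probability(states, probs):
    """E[P] = sum_k p_k * P_k for a finite channel profile."""
    return sum(p * P for p, P in zip(states, probs))


def per_path_capacity_bps(total_bps, n_paths):
    """Total capacity split evenly across the paths."""
    return total_bps / n_paths


# Bad state p_min occurs 10% of the time, good state p_max = 1 occurs 90% of the time.
sweep = [round(mean_delivery_probability((p_min, 1.0), (0.1, 0.9)), 2)
         for p_min in (0.1, 0.2, 0.3, 0.4, 0.5)]
```

For example, with $p_{\min} = 0.1$ the mean is $0.1 \times 0.1 + 0.9 \times 1 = 0.91$, and with four paths each path carries 250 kbps of the 1 Mbps total.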

Each time the wireless channel changed, it stayed constant for a random time drawn from the uniform distribution on [100ms, 200ms]; on average, the channels stayed constant for 150ms. The RTT for each path was drawn randomly from the uniform distribution on [100ms, 200ms]. The time-out clock is set to expire after 3 x the measured RTT. The RTT was measured using IIR filtering: measured RTT = 0.9 x old measured RTT + 0.1 x new measured RTT. The redundancy factor $r$ was chosen such that $(1 + r)\,p_{\min} \approx 2$, with $r$ being an integer. We assume a perfect uplink channel from the mobiles to the wireless routers for the end-to-end TCP ACK's.
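The RTT filter and the time-out rule above amount to a one-line recursion. The following sketch uses hypothetical function names and treats all times as milliseconds.

```python
def update_rtt(old_estimate, sample, weight_old=0.9):
    """IIR filter: new estimate = 0.9 * old estimate + 0.1 * new sample."""
    return weight_old * old_estimate + (1 - weight_old) * sample


def timeout(rtt_estimate):
    """Time-out clock expires after three measured RTTs."""
    return 3 * rtt_estimate
```

With an old estimate of 100ms and a new sample of 200ms, the filtered RTT moves to about 110ms, so a single slow sample only shifts the time-out from 300ms to roughly 330ms.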

We used no AQM; the presence of AQM was assumed mainly to simplify our analysis. Theorem 1 says that the TCP throughput achieved by using the AQM function in eq. (2.12) provides a lower bound on the optimal TCP throughput. When the AQM function in eq. (2.12) is used, the packet drop probability is deliberately increased (compared to when no AQM is used). Though we have no mathematical proof that the TCP throughput achieved without deliberately increasing the drop probability is larger than that achieved by deliberately doing so, we believe that using the AQM function in eq. (2.12) and deliberately increasing the drop probability only decreases the throughput.

The average throughputs we obtained are shown in Figure 2.3 as a function of the number of paths. As the number of paths increases, the average throughput increases towards $E[P]\,CM$ ($CM$ is fixed) in a concave manner, indicating that going from one path to two paths gives a large gain in throughput, especially when $p_{\min}$ is small. The throughput should reach 700 Kbps (for $E[P] = 0.91$) to 730 Kbps (for $E[P] = 0.95$).

Figure 2.3: Simulation results: Average throughput vs. number of paths for a multiple-path TCP-RLC connection. Note that the throughput of $M$ separate traditional TCP connections (whether over wired or wireless downlink networks) does not increase with the number of paths. In addition, TCP throughput over a wireless downlink network with a fixed FEC coding rate is $O(1)$. Thus, the throughput of traditional TCP over a wireless downlink network where the channel fluctuates at the time scale of the RTT is low and does not increase with the number of paths.



2.5 Appendix 

2.5.1 TCP-RLC Protocol 

In this subsection, we give a brief description of the NS-2 simulation implementation details. We break the description of TCP-RLC into three subsections.



2.5.1.1 Source Architecture 

ACK and pseudo ACK. TCP-RLC uses two types of ACK's: the ACK, as used in plain TCP, and the pseudo ACK, which we describe here.

Plain ACK's cumulatively acknowledge the reception of all packets with frame numbers smaller than or equal to the ACK.

In our context, a lost data packet can be "made up" by a future coded packet, and we would like the sliding congestion window to slide forward and keep a delay-bandwidth product's worth of packets in transit. Thus, we use a strategy similar to that in [71], where degrees of freedom are ACKed. In our context, we refer to this as a pseudo ACK, which simply ACKs any out-of-order data packet or coded packet that helps in decoding the smallest-index missing packet (e.g., if the sink has received packets 1, 2, 3, and 7, the smallest-index missing packet is 4). Note that with regular TCP, out-of-order packets would trigger duplicate ACK's that would lead to a loss of throughput.

To summarize: ACK - cumulatively acknowledges in-order packet arrivals; pseudo ACK - acknowledges out-of-order data packet/coded packet arrivals that help in decoding the missing packets.

Sliding congestion window. The source maintains a congestion window W, which is the same as the size of the coding block for TCP-RLC. All packets transmitted are marked either high or low priority; the source allows only W high priority packets to be in transit. Each time a high priority packet is transmitted, the source transmits r low priority packets as well, where $r > 1/p_{\min} - 1$.

Figure 2.4: Evolution of congestion window: the total number of packets in transit is composed of cwnd high priority packets (data packets) + redundant packets (RLC encoded packets). The number of redundant packets is some multiple of cwnd, and has to be greater than pi. Note that without priority transmission at the wireless router, this will reduce the throughput.

The source maintains two variables, last-ACK and SN. All packets with frame numbers lower than last-ACK are assumed to have been successfully received. SN is the frame number of the starting packet in the coding block currently being transmitted. If a pseudo ACK or ACK arrives "acknowledging" the reception of SN, a new coding block is encoded and readied for transmission, with the coding block size being W + 1 packets. This is because acknowledgment of packet number SN implies that an RTT has elapsed, which is enough time for W packets to have been successfully received and decoded by the sink.

Multiple paths. When multiple paths are used, the marking of packets is done by the paths independently. Each path is maintained by a path controller, and the controller maintains a congestion window, cwnd_i; another top-level controller maintains cwnd, which is used as the size of the coding block. After the packets are encoded, they are passed to path controllers (thus, coded packets are "mixed" across paths, which in turn leads to statistical multiplexing across paths). Each path marks packets either high or low priority. In each RTT slot, the number of high priority packets in transit is equal to cwnd_i. The number of low priority packets (coded packets) per path is equal to r x cwnd_i. Each packet going out on a path contains the block number and block size, which is also equal to cwnd_i. In each RTT slot, if the sink on path i receives cwnd_i packets (either high or low priority), cwnd_i increases by one; else cwnd_i reduces by half. Note that a single path is just a special case of multiple paths, in which the variables cwnd and cwnd_i are the same. (In the single-path case, the role played by the separate path controllers is subsumed into the source controller.)

There are two levels of ACK's: one level for the source controller (source level ACK, consisting of ACK and p-ACK) and the other for the path controllers (path level ACK). Note that: (i) the path level ACK is a new ACK introduced for multi-path TCP - there is no equivalent in the single-path case, and (ii) the three types of ACKs described here are abstracted into a single indicator function for success/drop in the analysis (see eq. (2.1) in section 2.2). Source controller level ACK's affect cwnd and move the coding window; path controller level ACK's affect the cwnd_i's and move the block numbers.

2.5.1.2 Destination Architecture 

Upon reception of a packet, the destination examines whether it is the next expected (i.e., smallest-index missing) packet. If it is, the destination sends an ACK cumulatively acknowledging all packets up to and including the just received packet. If not, the destination checks whether the packet is an innovative packet that can be used to decode the next expected packet. (A packet is innovative if it is linearly independent of all packets received so far, i.e., it can help in decoding the next expected packet. For a complete definition of innovative packets, see [46] and [29].) If the packet is helpful, the destination sends a pseudo ACK with next expected packet number + total number of innovative packets accumulated that can help in decoding the next expected packet. If the packet does not help in decoding the next expected packet, the destination sends a duplicate ACK.

2.5.1.3 Wireless Router Architecture 

As mentioned, the wireless router maintains two buffers for each flow. A FIFO buffer is maintained for high priority packets; a LIFO buffer is maintained for low priority packets. The FIFO buffer does not need to be large, only large enough to handle packet processing. The LIFO buffer needs to be large enough to hold an RTT's worth of packets. In our implementation, we do not have an explicit AQM mechanism at the router (other than tail-drop); however, even without AQM we observe large performance gains through simulations.

2.5.1.4 Illustrative Example 

We illustrate the additive increase, multiplicative decrease component of TCP-RLC and the pseudo ACK's using three examples. The examples are for the case when a TCP connection uses a single path, and the data and coded packets are marked high and low priority, respectively. Although the TCP source uses retransmission time-outs for when there is no response from the sink for a long period of time, we do not show this in our examples.

1. Additive increase: When there is no congestion at the wireless router and the sink receives enough data and coded packets in a coding block, the sink will be able to recover any missing data packets. In Figure 2.5(a), the four packets P1-P4 are encoded together. While P1 is successfully received by the sink, P2-P4 are dropped/lost due to a bad wireless channel. However, the sink has received enough coded packets to recover packets P2-P4. As the sink receives these coded packets, it sends out a pseudo ACK for each one. This enables the source to move the congestion window forward, keeping the "pipe" between the source and the sink full. When the sink recovers P2-P4, it sends out an ACK acknowledging the successful reception of packets up to P4. Note that the congestion window is increased by one packet, and thus new packets P5-P9 are encoded together.

2. Multiplicative decrease (bad wireless channel): When too many packets that belong to the same coding block are dropped due to a bad wireless channel, duplicate ACK's will be triggered by packets that do not help in decoding the next expected packet. In Figure 2.5(b), data packets P1, P3 and P4 are dropped and not enough coded packets arrive due to a bad wireless channel. When packets P5-P7 arrive at the sink, duplicate ACK's are sent out and the source will cut the congestion window. Note that when p-ACK 1 is received by the source, a new coding block P5-P9 is encoded and passed to the path controllers. However, when the duplicate ACK's are received, the source decreases the congestion window by half to two packets, and restarts transmission starting from P1.

3. Multiplicative decrease (congestion at wireless router): When there is congestion at the wireless router, the router will be busy transmitting data (high priority) packets, and no or very few coded (low priority) packets will be transmitted. In Figure 2.5(c), packets P1-P3 are successfully received, but no coded packets are received to help the sink recover packet P4, which has been lost due to the bad wireless channel condition at the time. Thus, the sink will send out duplicate ACK's and the source will cut the congestion window and restart transmission starting from P4.

Figure 2.5: Illustrative example



Chapter 3 



Time-Scale Decoupled Routing and Rate 
Control in Intermittently Connected Networks 

3.1 Introduction 

There has recently been much interest in intermittently connected networks (ICNs). Practical uses of such networks include military scenarios in which geographically separated clusters of soldiers are deployed in a battlefield. Each cluster is connected wirelessly internally, but the clusters of soldiers rely on unmanned aerial vehicles to transport battlefield information between clusters. In such a case, building a communication network can take too much time, as these combat units must be deployed rapidly, and wireless connections are often susceptible to enemy jamming or the communication/RF ranges might not be large enough.

Another scenario of interest is a sensor network composed of multiple clusters, with each cluster containing low power sensor nodes. The data collected by the sensor nodes must be transmitted either to a data fusion node or to another cluster. To do this, the sensor nodes rely on mobile nodes that provide inter-cluster connectivity, resulting in a network with multiple time-scales and intermittent connectivity.

In this dissertation, we consider a network of clusters of nodes connected via "mobile" nodes (see Figure 3.1 for an example). Internally, each cluster has many nodes connected via a (multi-hop) wireless network. Each cluster has at least one gateway node. (We will call the other nodes in the cluster internal nodes.) These gateways are the designated representatives of the clusters, and they are the only ones able to communicate with the mobiles - traffic from one cluster to another cluster (inter-cluster traffic) must be funneled through the gateways, both in the source cluster and in the destination cluster. The mobiles and gateways exchange packets (pick-ups and drop-offs) on contact. Each contact is made over a high capacity link and is long enough for a large quantity of data to be exchanged. The mobiles then move between clusters, and on contact with a gateway in the destination cluster, packet drop-offs are made.

A key challenge in the network above is the fact that intermittently 
connected networks have several time-scales of link variability. For instance, 
wireless communication between soldiers within the same cluster is likely to 
occur at a time-scale several orders of magnitude faster than communication 
across clusters (which needs to use the mobile carriers). In this context, there 
are essentially two time-scales: (a) within a cluster, where wireless links are 
formed in an order of tens of milliseconds, and (b ) across clusters where the 
time-scale could be tens of seconds, to minutes. To communicate from one node 
to another node in the same cluster poses no significant problem - one can use 
existing protocols such as TCP. However, for two nodes in two different
clusters to communicate, they must use the mobile nodes, as these mobiles move
between clusters to physically transport data. Hence, the mobile communica- 
tion time scale is many, many times greater than the electronic communication 
time scale. Any communication protocol that relies on fast feedback (on the
order of milliseconds to tens or hundreds of milliseconds) incurs severe
performance degradation. The mobiles may be able to transport a large quantity of
data (on the order of megabytes or gigabytes) in one "move," but moving from
one cluster in one part of a network to another part still takes time (on the
order of seconds to minutes).

The design and development of communication protocols for intermit- 
tently connected networks, therefore, must start with an algorithm with as 
few assumptions about the underlying network structure as possible. The 
back-pressure (BP) routing algorithm [73] was introduced nearly two decades 
ago by Tassiulas and Ephremides with only modest assumptions about the 
stability of links, their "anytime" availability, or feasibility of fast feed-back 
mechanism; yet remarkably, it is throughput optimal (throughput performance 
achieved using any other routing algorithm can be obtained using the back- 
pressure algorithm [73]) as well as resilient to changes in the network. The BP 
routing algorithm is a dynamic routing and scheduling algorithm for queuing 
networks based on congestion gradients. The congestion gradients are com- 
puted using the differences in queue lengths at neighboring nodes (the routing 
part). Then, the back-pressure algorithm activates the links so as to maximize
the sum of the link weights of the activated links, where the link weights are
set to the congestion gradients (the scheduling part). Over the years, there
has been continued effort to further develop back-pressure type algorithms to 
include congestion control and to deal with state-space explosion and delay 
characteristics [2, 3, 23, 47, 48, 50, 68, 78, 81, 82]. 
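
The routing and scheduling parts described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation of [73]: the names `link_weight` and `backpressure_schedule`, the dictionary-based queue representation, and the brute-force search over a caller-supplied list of feasible link activations are all hypothetical choices made for the example.

```python
from typing import Dict, List, Tuple

Link = Tuple[str, str]               # directed link (transmitter, receiver)
Queues = Dict[str, Dict[str, int]]   # queues[node][destination] = backlog in packets


def link_weight(queues: Queues, link: Link) -> Tuple[int, str]:
    """Routing part: largest positive backlog difference over the link, and its destination."""
    tx, rx = link
    best_diff, best_dest = 0, ""
    for dest, backlog in queues[tx].items():
        diff = backlog - queues[rx].get(dest, 0)
        if diff > best_diff:
            best_diff, best_dest = diff, dest
    return best_diff, best_dest


def backpressure_schedule(
    queues: Queues, activations: List[List[Link]]
) -> List[Tuple[Link, str]]:
    """Scheduling part: pick the feasible activation with the largest sum of link weights."""
    best_total, best_choice = -1, []
    for activation in activations:
        weighted = [(link,) + link_weight(queues, link) for link in activation]
        total = sum(weight for _, weight, _ in weighted)
        if total > best_total:
            best_total = total
            # Links with zero weight stay idle (assumed convention).
            best_choice = [(link, dest) for link, weight, dest in weighted if weight > 0]
    return best_choice


# Line network a -> b -> c with all packets destined for a node "d" (assumed example).
# The two links interfere, so each feasible activation contains exactly one of them.
queues = {"a": {"d": 5}, "b": {"d": 2}, "c": {"d": 0}}
schedule = backpressure_schedule(queues, [[("a", "b")], [("b", "c")]])
```

In this toy example the link from a to b has the larger congestion gradient (3 versus 2), so it is the one activated. Enumerating activations explicitly is only practical for very small networks; it is used here to keep the sketch short.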

However, the traditional back-pressure algorithm is impractical in in- 
termittently connected networks, even though it is throughput optimal. This is 
because the delay performance and the buffer requirement of the back-pressure 
algorithm in the heterogeneous connectivity setting of the ICNs increase with 
the product of the network size and the time scale of the intermittent con- 
nections - i.e., the larger the network or more intermittent and sporadic the 
connections, the larger the delay and buffer requirement. However, we believe 
that the back-pressure algorithm is a reasonable starting point for developing 
rate control/routing protocols for intermittently connected networks. In this 
dissertation, we design, implement and evaluate the performance of two-scale 
back-pressure algorithms specially tailored for ICNs. 

3.2 Related Works 

The back-pressure algorithm and distributed contention resolution mech- 
anism in wireless networks, in one form or another, have been studied and im- 
plemented in [4, 34, 41, 47, 55, 56, 67, 78]. [78] improves TCP performance over 
a wireless ad-hoc network by utilizing the back-pressure scheduling algorithm 
with a backlog-based contention resolution algorithm. [55] improves multipath
TCP performance by taking advantage of the dynamic and resilient route
discovery inherent in BP. The authors in [47] have implemented and
studied back-pressure routing over a wireless sensor network. They have used 
the utility-based framework of the traditional BP algorithm, and have de- 
veloped implementations with good routing performance for data gathering 
(rate control is not studied in [47]). Their chief objective is to deal with the 
poor delay performance of BP. [67] is an implementational study of how the 
performance of BP is affected by network conditions, such as the number of 
active flows, and under what scenarios backlog-based contention resolution al- 
gorithm is not necessary. [41] studies utility maximization with queue-length 
based throughput optimal CSMA for single-hop flows (with no routing or
intermittent connectivity). More recent works on contention resolution mechanisms
are [4,34,56]. In [4], a queue-based contention resolution scheme is proposed; 
however, the proposed algorithm only uses the local estimates of the neighbors' 
queue lengths to change the contention window parameter of IEEE 802.11, un- 
like the original back-pressure algorithm which requires explicit neighborhood 
queue length feedback. The authors of [56] proposed another form of queue- 
based contention resolution algorithm, which they conjecture does not require 
any neighborhood queue length feedback or message passing; however, the
algorithm in [56] requires larger buffers, resulting in long delays. A channel access
mechanism not based on queue lengths is proposed in [34]. Here, the authors
propose a distributed time-sharing algorithm that allocates time slots based 
on the number of flows in the wireless network. Lastly, [2] is not an
implementation, but discusses many issues related to implementing BP routing
with rate control. Our study differs from all of them in that we focus on the
multiple time-scales issue in an intermittently connected network (thus, queues 
throughout the network get "poisoned" with the traditional BP), and study 
modifications that loosely decouple the time-scales for efficient rate control. 

A cluster-based back-pressure algorithm was first studied in [82]
to reduce the number of queues in the context of traditional networks.
However, the algorithm as proposed in [82] in general does not separate
the fast intra-cluster time scale from the slow inter-cluster time scale that
arises from intermittently connected mobile carriers, thus leading to
potentially large queue lengths at all nodes along a path. (We will demonstrate
this in Section 3.3.)
Our proposed algorithms in this dissertation explicitly decouple the two time
scales by separating the network into two layers, with each layer operating its 
own back-pressure algorithm, and by allowing the two layers to interact in a 
controlled way at nodes that participate in inter-cluster traffic. This explicit 
separation is the key property that leads to much smaller buffer usage and 
end-to-end delays and more efficient network resource utilization. 

Initially, the approach taken in intermittently connected networks and 
DTNs for routing was based on packet replications. The simplest way to make 
sure packets are delivered is to flood the "mobile" portion of the network 
so that the likelihood of a packet reaching the destination increases as more 
and more replicas are made [77]. A more refined approach is to control the 
number of replicas of a packet so that there is a balance between increasing 
the likelihood and still leaving some capacity for new packets to be injected
into the network [8,15,43,66,77]. Another refined approach is to learn the
intermittently connected topology and use this knowledge to route/replicate 
through the "best" contacts and encounters and avoid congestion [33,35,42, 
64,74]. 

[27, 51, 79] study networks that are closer to ours. In [79], distant groups 
of nodes are connected via mobiles, much like our network but with general 
random mobility. At the intra-group level, a MANET routing protocol is used 
for route discovery, and at the inter-group level, the Spray-and-Wait algorithm 
[66] is used among mobiles to decrease forwarding time and increase delivery 
probability. [51] augments AODV with DTN routing to discover routes and 
whether those routes support DTN routing and to what extent they support 
end-to-end IP routing and hop-by-hop DTN routing. [27] studies how two 
properties of the mobile nodes, namely whether a mobile is dedicated to serve a 
specific region (ownership) and whether the mobile movement can be scheduled 
and controlled by regions (scheduling time), affect performance metrics such 
as delay and efficiency. 

Because replication-based algorithms inject multiple copies of a packet, 
they suffer from throughput drops. However, all the aforementioned replication- 
based algorithms are valuable as they provide insight into engineering an ef- 
ficient and robust ICN protocol. There is (to the best of our knowledge) no 
literature on rate control over ICNs. We demonstrate in this dissertation that 
it is possible to obtain utility maximizing rate allocation, even though there is
the "mobile-gateway" time scale that operates much slower than the wireless
communication time scale, and all inter-cluster packets have to pass through
the two different time scales.

Figure 3.1: Our ICN consists of two clusters and one mobile node connecting
these clusters. We have one intra-cluster flow from 2.101 to 2.100, and
one inter-cluster flow from 1.100 to 2.100, which must rely on the mobile to
transport data from the left to the right cluster.

3.3 Motivation: Difficulties with Traditional Back-Pressure 

Consider a simple, intermittently connected line network as shown in 
Figure 3.1. We have two clusters geographically separated. We have two 
gateways (1.104 and 2.103) representing the left and right clusters, respectively. 
In the left cluster, we have $N_c$ nodes, and in the right cluster, we have three
nodes. Between these two clusters, we have a "mobile" contact node 0.100 that 
moves from one cluster to the other every ten seconds. On contact, the mobile 
and the gateways (the designated nodes in the cluster that can communicate 
with the mobiles) can exchange a large quantity of packets. Finally, there are 
two flows; an inter-cluster flow originating from 1.100 and an intra-cluster flow 
originating from 2.101; both flows are destined for 2.100. 

Figure 3.2: The inter-cluster rate suffers serious throughput degradation under
the traditional BP with rate controller because the inter-cluster source
mistakenly sees the intermittent link as a low-capacity (with low delay) link and
not as a high-capacity (with high delay) link (see Figure 3.2(a)). Under the
traditional BP, even if the inter-cluster rate is fixed to the correct value, the
inter-cluster end-to-end delay grows to be extremely large as the source cluster
size increases (see Figure 3.2(b)).

In this network, routing is straightforward. But the question here is:
What is the rate at which these flows can transmit data? An even more basic
question is: Can these flows attain high and sustainable throughput¹, provided
that the link capacity between the mobiles and gateways is high enough (albeit
with extreme delays)? How close can we get to the maximum throughput
allowed by the network? Can we obtain utility-maximizing rate allocation
over an ICN? What will be the delay performance in such networks?

The answer to this is clearly negative, if TCP is used for rate control, 
and we shall see that even with traditional back-pressure algorithms [23, 73] 
that have a theoretical guarantee that the above is possible, in a practical 
setting, the answer still seems to be negative! 

To put the above statement in context, we know that the back-pressure 
(BP) routing/rate control algorithm is throughput optimal [73], meaning that 
if any routing/rate control algorithm can give us a certain throughput
performance, so can the back-pressure algorithm. Contrapositively, if the
back-pressure algorithm cannot give a certain throughput performance, no other
algorithm can do so. Further, a rate controller based on utility maximization 
can be added to this framework [23,50,68] that is theoretically utility max- 
imizing, and it chooses rates that (averaged over a long time-scale) lead to 
high and sustainable throughput corresponding to the rates determined via an 
optimization problem [23,50,68]. 

¹ By "high" we mean close to the maximum throughput possible, and by "sustainable"
we mean stochastically stable.

We first consider the performance of a traditional BP based routing/rate
control algorithm. In Figure 3.2(a), we plot the rate trace of the two
(inter- and intra-cluster) flows (with $N_c = 2$). (Figure 3.2(a) is obtained
experimentally. Each source uses the BP congestion algorithm [23,48,50,68] which
we will describe later). In the figure, we can see that the inter-cluster traffic 
performs very poorly, even though the mobile-gateway contact has enough ca- 
pacity. The reason is simple - the BP congestion control uses the local queue 
length as a congestion signal, and between two successive contacts that can 
be seconds or minutes apart, there is a large queue build-up to the point that 
the inter-cluster source mistakenly believes that the network has low-capacity 
(and low-delay) links. Because of this, the inter-cluster source is not able to 
fully utilize the contacts (see Figure 3.2(a)). 

Importantly, the rate achieved by the inter-cluster traffic is much lower
than that predicted by the theory (the theory predicts that the intra-cluster rate
is ~200 KBps and the inter-cluster rate is ~100 KBps). This is because the
theoretical results hold only when the utilities of users are scaled down by a 
large constant - this is to (intuitively) enable all queues in the network to build 
up to a large enough value in order to "dilute" the effects of the "burstiness" 
of the intermittently connected link. However, scaling down the utilities by a 
large constant will result in long queues over the entire network, even at nodes 
that do not participate in inter-cluster traffic. 

Furthermore, even if the inter-cluster source is aware of the presence 
of these intermittent mobile-gateway links and therefore can transmit at the 
correct rate (i.e., a genie computes the rate and tells this to the source), the 
problem can manifest itself in another way.

Consider the same network as in Figure 3.1, but with only the inter-cluster
flow from 1.100 to 2.100, and with the left cluster size $N_c$ varied from
10 to 50. See Figure 3.2(b). (Figure 3.2(b) is obtained from simulations.) We
run the traditional BP algorithm (with no rate controller, as the genie has
solved this problem) and fix the source rate at 200KBps. In this case, a
large backlog builds up not only at node 1.104 (the intermittently
connected gateway), but also at every node in the left cluster.

3.3.1 Main Contributions 

In this chapter, we design, implement and empirically study the per- 
formance of a modified back-pressure algorithm that has been coupled with 
a utility based rate controller for an intermittently connected network. Our 
contributions are: 

1. We present a modified back-pressure routing algorithm that can separate 
the two time scales of ICNs. Separating the time scales improves the end- 
to-end delay performance and provides a throughput arbitrarily close to 
the theory. A key advantage of this modified BP algorithm is that it 
maintains large queues only at nodes which are intermittently connected; 
at all other nodes, the queue sizes remain small. 

2. On top of our modified back-pressure routing algorithm, we implement a
rate controller on our testbed. The essential components of our testbed are
built with modified MadWifi and Click [39]. The nodes in our testbed are
organized into multiple clusters, with intermittent connectivity emulated
using an Ethernet switch.

3. Using this testbed, we first show that the traditional back-pressure
algorithm coupled with a utility function-based rate controller is not suitable
for intermittently connected networks, and leads to a large divergence
between theoretically predicted rates and actual measurements (shown
for an intermittently connected line network). We then present measurement
results using our time scale decoupled algorithm on a line network
and on a larger network.

4. Finally, we present a practical implementation of shadow queues
developed by the authors in [14]. Using our implementation, we demonstrate
that there is a trade-off between shorter end-to-end inter-cluster delay
and network capacity utilization.

3.4 Network Model 

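The time is slotted, with $t$ denoting the $t$-th time slot. The intermittently
connected network consists of multiple clusters. Each cluster $\mathcal{C}_i$ is represented
by a graph $\mathcal{G}_{\mathcal{C}_i} = (\mathcal{N}_{\mathcal{C}_i}, \mathcal{L}_{\mathcal{C}_i})$, where $\mathcal{N}_{\mathcal{C}_i}$ is the set of nodes in $\mathcal{C}_i$ and $\mathcal{L}_{\mathcal{C}_i}$
is the set of links. Let $\mathcal{C}(n)$ denote the cluster to which node $n$ belongs.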
The clusters are geographically separated, and two nodes in distinct clusters 
cannot communicate with each other directly (they must rely on the mobiles 
to transport data).

The clusters are connected by a set $\mathcal{M}$ of mobile carrier nodes that move
around to carry packets from one cluster to another. For each cluster, the set
of nodes that can communicate with the mobiles is fixed. These nodes are
called gateways; those nodes that cannot communicate with the mobiles
directly are called internal nodes. Let $\mathcal{I}_{\mathcal{C}_i}$ and $\mathcal{K}_{\mathcal{C}_i}$ denote the sets of internal
nodes and gateways in $\mathcal{C}_i$, respectively.

We use $g(i,j)$ to denote gateway $j$ in cluster $\mathcal{C}_i$. To simplify the
notation, we assume that gateways $g(\cdot,j)$ have access to mobile $m(j)$ only. The
mobiles change gateways every $T$ time slots, which is called a super time slot.
We let $\tau$ denote the $\tau$-th super time slot. Here, $T$ is a very large number. A
time slot is the time scale of one intra-cluster packet transmission, and $T$ is
the time scale of the mobility (thus, a time slot is roughly a few milliseconds
long, and $T$ is roughly $> 10^3$ to reflect the mobility time scale, which is seconds
or minutes long). In our model, the size of the clusters is $\ll T$.

We assume that the mobility of the mobiles follows a Markov process.
Given that mobile $m(j)$ is at gateway $g(i_1,j)$ at the beginning of super time
slot $\tau$, the probability that it moves to gateway $g(i_2,j)$ at the beginning of
super time slot $\tau + 1$ is

$$\Pr\big(m(j) - g(i_2,j) \text{ in } \tau+1 \,\big|\, m(j) - g(i_1,j) \text{ in } \tau\big) = P_{m(j)}(i_1,i_2),$$

where "$-$" means that the mobile and the gateway are in contact. Let $P_{m(j)}$
denote the transition probability matrix of mobile $m(j)$ and let $\pi_{m(j)}$ be the
corresponding stationary distribution. The Markov chains are assumed to be
aperiodic and irreducible. The assumption that $g(\cdot,j)$ has access only to
$m(j)$ is not necessary; we make this assumption to simplify our notation.
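
As a small illustration of the mobility model, the stationary distribution of a mobile's transition matrix can be approximated numerically. The sketch below uses plain power iteration; the function name `stationary_distribution` and the two-gateway transition matrix are hypothetical and chosen only for the example. The aperiodicity and irreducibility assumed above are what make this iteration converge to a unique limit.

```python
from typing import List


def stationary_distribution(P: List[List[float]], iterations: int = 1000) -> List[float]:
    """Approximate pi satisfying pi = pi P by repeatedly applying P (power iteration)."""
    n = len(P)
    pi = [1.0 / n] * n                      # start from the uniform distribution
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][k] for i in range(n)) for k in range(n)]
    return pi


# Assumed example: a mobile serving two gateways. From gateway 0 it stays or
# moves with equal probability; from gateway 1 it returns with probability 0.25.
P = [[0.5, 0.5],
     [0.25, 0.75]]
pi = stationary_distribution(P)   # fraction of super time slots spent at each gateway
```

For this matrix the balance condition gives one third of the contacts at gateway 0 and two thirds at gateway 1, which is the long-run contact share each gateway can expect from the mobile.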

3.4.1 Traffic Model 

A traffic flow is defined by its source and destination. We assume 
that the sources and destinations are all internal nodes. If the source and 
the destination lie in the same cluster, then the traffic flow is an intra-cluster 
traffic, and the intra-cluster traffic can be routed only within the cluster. If 
the source and the destination lie in different clusters, the traffic flow is an 
inter-cluster traffic. We let $[s,d]$ denote the flow from $s$ to $d$, $\mathcal{F}$ denote the
set of all flows, and $\mathcal{F}_{\text{inter}}$ and $\mathcal{F}_{\text{intra}}$ be the sets of all inter- and intra-cluster
flows, respectively. We let $x_s^d$ be the number of packets source $s$ generates per
time slot for destination $d$, $x = \{x_s^d : [s,d] \in \mathcal{F}\}$, and $x_{\text{intra}}(\mathcal{C})$ be the set of
intra-cluster traffic rates in cluster $\mathcal{C}$.

Note that all inter-cluster traffic flows must be forwarded to the gate- 
ways in source clusters, then carried over to the gateways in destination clusters 
via the mobile nodes before reaching their destinations. 

3.4.2 Communication Model 

Let $\mu_{(n_1,n_2)}[t]$ denote the transmission rate (packets/time slot) of link
$(n_1,n_2)$ at time $t$, and $\mu_{\mathcal{C}_i}[t] = \{\mu_{(n_1,n_2)}[t],\ (n_1,n_2) \in \mathcal{L}_{\mathcal{C}_i}\}$. Let $\Gamma_{\mathcal{C}_i}$ be the
convex hull of the set of all feasible transmission rates in cluster $\mathcal{C}_i$. We note
that in general, $\mu_{\mathcal{C}_i}[t]$ and $\Gamma_{\mathcal{C}_i}$ depend on the interference model used for cluster $\mathcal{C}_i$.

We assume that a mobile and a gateway can send R packets to each 
other per contact. We assume that the transmissions between mobiles and 
gateways do not cause interference to other transmissions. 

For an example, see the network depicted in Figure 3.4. Here, we 
have three clusters, with two gateways in each and each cluster connected in a 
grid. We have pairs of nodes in different clusters communicating using the two 
mobiles (inter-cluster flows), which shuffle between these clusters transporting 
data from one cluster to another. In addition, we have intra-cluster flows in 
each cluster. 

3.5 Two-Scale BP with Queue Reduction: BP+SR 

In this section, we introduce our two-scale back-pressure routing al- 
gorithm (which we refer to as BP+SR (Source Routing)) that separates the 
time scales of inter-cluster and intra-cluster connections while, at the same
time, reducing the number of queues that need to be maintained. We build on
this two-scale back-pressure algorithm in Section 3.6 to implement a
utility-maximizing rate controller.

3.5.1 Queuing Architecture 

In our algorithm, the network maintains two types of queues. The first 
type, referred to as type-I, will be denoted by $q$, and the second type, type-II,
will be denoted by $u$.

Any internal node $n_1$ maintains a type-II queue $u_{n_1}^{g}$ for each gateway
$g$ in the same cluster and a type-I queue $q_{n_1}^{n_2}$ for each node $n_2$ in the same
cluster. A gateway $g_1$ maintains a type-II queue $u_{g_1}^{g_2}$ for each of the other gateways
$g_2$ in the network (even for gateways in the same cluster).

For each node $n$ in the same cluster, gateway $g$ maintains both a type-I
queue and a type-II queue for node $n$. A mobile $m$ maintains a separate type-II
queue $u_m^{g}$ for each gateway $g$ in the network. We use $q_a^b[t]$ ($u_a^b[\tau]$) to denote
the length of the type-I (type-II) queue maintained by node $a$ for node $b$ at
the beginning of time slot $t$ (super time slot $\tau$). Note that $u_n^n = 0$ and
$q_n^n = 0$ at all times $\forall n$.



[Figure 3.3 panel labels: Part I: Choose gateways (Eq. 3.1); Part II: Transfer to type I queue (Eq. 3.2); Part IV: Transfer to ...; Part V: BP inside cluster with type I queues; Part VI: Tx/Rx with mobile (Eqs. 3.6 & 3.7).]

Figure 3.3: In this figure, we break down the BP+SR algorithm into six parts. 
Part I is the source routing. Parts V and VI are the intra- and inter-cluster 
back-pressure routing algorithms, respectively. The two back-pressure algo- 
rithms interact via packet transfers between type I and II queues. Part III 
is for the load balancing over different gateways in the same cluster using 
the intra-cluster back-pressure, i.e., the gateways use the intra-cluster
back-pressure as the backhaul link for transferring packets between them.

3.5.2 BP+SR Algorithm

We now present our two-scale BP routing algorithm, BP+SR (Source
Routing). We give a high-level overview using Figure 3.3 before discussing
the algorithm in detail. First, an inter-cluster traffic source takes a group
of packets and chooses the optimal gateways (in both source and destination
clusters) through which these packets should be routed (Part I). The optimal
gateways chosen can change over time. These packets are routed from the
source to the chosen optimal source gateway, and from the chosen optimal
destination gateway to the destination, using the intra-cluster back-pressure
inside the respective clusters (Part V); they are routed from the source
gateway to the destination gateway using the inter-cluster back-pressure (Parts III
and VI). The interaction between the two back-pressure routing algorithms
happens through packet transfers between the two types of queues (Parts II,
III, and IV). Part VII is presented for mathematical convenience and is not part
of the actual protocol.

Part I: Selecting source and destination gateways 

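At the beginning of super time slot $\tau$, the inter-cluster traffic source $s$
picks the source and destination gateways $g_s^*[\tau]$ and $g_d^*[\tau]$ such that

$$(g_s^*[\tau], g_d^*[\tau]) \in \arg\min_{g_s \in \mathcal{K}_{\mathcal{C}(s)},\ g_d \in \mathcal{K}_{\mathcal{C}(d)}} \left( u_s^{g_s}[\tau] + u_{g_s}^{g_d}[\tau] + u_{g_d}^{d}[\tau] \right). \tag{3.1}$$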

This route selection is done at the beginning of each super time slot (i.e., every 
T time slots). The source s makes the routing decision (Eq. 3.1) independently
of other inter- and intra-cluster sources. The source and destination gateways
are chosen such that the pair minimizes the total queue lengths considering 
the intra-cluster path from the source to its gateway, the inter-cluster path 
between the two gateways, and the intra-cluster path between the destination 
gateway and the destination node. 
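
The gateway selection of Part I amounts to a search over all (source gateway, destination gateway) pairs. The following sketch is a hedged illustration only: the function `select_gateways`, the dictionary `u` keyed by (holder, target) pairs, and the node names are hypothetical, and the sketch assumes the source already knows the relevant type-II queue lengths.

```python
from itertools import product
from typing import Dict, Iterable, Tuple


def select_gateways(
    u: Dict[Tuple[str, str], int],
    s: str,
    d: str,
    source_gateways: Iterable[str],
    dest_gateways: Iterable[str],
) -> Tuple[str, str]:
    """Return the (g_s, g_d) pair minimizing u[s, g_s] + u[g_s, g_d] + u[g_d, d].

    u[(a, b)] is the length of the type-II queue that node a keeps for node b.
    """
    def total_backlog(pair: Tuple[str, str]) -> int:
        g_s, g_d = pair
        return u[(s, g_s)] + u[(g_s, g_d)] + u[(g_d, d)]

    return min(product(source_gateways, dest_gateways), key=total_backlog)


# Assumed example: two gateways per cluster, queue lengths in packets.
u = {
    ("s", "g1"): 10, ("s", "g2"): 4,
    ("g1", "h1"): 1, ("g1", "h2"): 1,
    ("g2", "h1"): 9, ("g2", "h2"): 3,
    ("h1", "d"): 2, ("h2", "d"): 5,
}
choice = select_gateways(u, "s", "d", ["g1", "g2"], ["h1", "h2"])
```

Here the pair (g2, h2) has the smallest total backlog (4 + 3 + 5 = 12), so it is the one selected for the coming super time slot. Ties are broken by enumeration order in this sketch, which is an arbitrary choice.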

Part II: Traffic control at the source nodes 

• For an inter-cluster flow $[s,d]$, the source node $s$ deposits newly arrived
packets into queue $u_s^{g_s^*[\tau]}$ during time slot $t \in [\tau T, (\tau+1)T)$. The identities
of the source gateway $g_s^*[\tau]$ and the destination gateway $g_d^*[\tau]$ are recorded
in the headers of the packets.

• For an intra-cluster flow $[s,d]$, the source node $s$ deposits the newly arrived
packets into queue $q_s^d$.

• Define $\theta_s^{g_s}[\tau] = u_s^{g_s}[\tau]/K_s$, where $K_s = T/|\mathcal{C}(s)|$. ($|\mathcal{C}|$ is the number of nodes
in cluster $\mathcal{C}$.) Consider the queues associated with gateway $g_s \in \mathcal{K}_{\mathcal{C}(s)}$. If

$$\theta_s^{g_s}[\tau] > q_s^{g_s}[t] \tag{3.2}$$

at time $t \in [\tau T, (\tau+1)T)$, $\eta$ packets are transferred from queue $u_s^{g_s}$ to queue
$q_s^{g_s}$ at the beginning of time slot $t$. ($\eta$ is some positive value greater than
the largest transmission rate out of any node inside a cluster.)

When a packet arrives at the source gateway $g_s^*$, the source gateway sets
the next destination of that packet to $g_d^*$, which it finds in the
packet header. (See Eq. (3.1).) The gateway then inserts that packet
into the queue $u_{g_s^*}^{g_d^*}$.

To achieve throughput optimality, an inter-cluster traffic source must
do source routing as in Eq. (3.1), and when it does source routing, the length
of the queue $u_s^{g_s}$ will become of order $O(T)$, and there must be a way to release
the packets stored in the queue $u_s^{g_s}$ into the cluster so they can reach the source
gateway. The purpose of Eq. (3.2) is exactly that: to release/transfer the
packets from $u_s^{g_s}$ (type-II queue) to $q_s^{g_s}$ (type-I queue) at a controlled and
acceptable rate to the cluster. The factor $|\mathcal{C}(s)|$ is needed so as to prevent the
inter-cluster end-to-end delay from scaling with the cluster size.
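
The release rule of Eq. (3.2) can be pictured as a threshold test run once per time slot: packets move from the type-II queue to the type-I queue only while the type-I queue is shorter than the type-II length (sampled at the start of the super time slot) scaled by the cluster size over T. The sketch below is a minimal illustration; the function name `release_step` and its arguments are hypothetical, and capping the transfer at the number of packets actually stored is an assumption of the sketch rather than something stated in the text.

```python
from typing import Tuple


def release_step(
    u_len: int, q_len: int, u_snapshot: int, T: int, cluster_size: int, eta: int
) -> Tuple[int, int]:
    """One time slot of the Part II transfer from a type-II queue to a type-I queue.

    u_snapshot is the type-II queue length sampled at the start of the super
    time slot; u_len and q_len are the current queue lengths.
    """
    theta = u_snapshot * cluster_size / T    # u / K_s with K_s = T / |C(s)|
    if theta > q_len:
        moved = min(eta, u_len)              # assumption: cannot move more than is stored
        return u_len - moved, q_len + moved
    return u_len, q_len


# Assumed numbers: T = 1000 slots, a 5-node cluster, eta = 3 packets per slot.
first = release_step(u_len=1000, q_len=0, u_snapshot=1000, T=1000, cluster_size=5, eta=3)
second = release_step(u_len=997, q_len=6, u_snapshot=1000, T=1000, cluster_size=5, eta=3)
```

With these numbers the threshold is 5 packets, so the first call releases 3 packets while the second call releases none, because the type-I queue already holds 6. This keeps the type-I backlog inside the cluster small even though the type-II queue is of order T.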

Part III: Traffic control at the gateway nodes 

The gateway $g_1$ computes $l_{g_1,g_2}[\tau] \in \arg\max_{l:\ \text{gateway}} \left( u_{g_1}^{l}[\tau] - u_{g_2}^{l}[\tau] \right)$ at
the beginning of each super time slot for each gateway $g_2$ in the same cluster.
Define $\theta_{g_1}^{g_2}[\tau] = \left\{ u_{g_1}^{l_{g_1,g_2}[\tau]}[\tau] - u_{g_2}^{l_{g_1,g_2}[\tau]}[\tau] \right\} / K_{g_1}$, where $K_{g_1} = T/|\mathcal{C}(g_1)|$. At
each time slot $t \in [\tau T, (\tau+1)T)$, $g_1$ transfers $\eta$ packets from $u_{g_1}^{l_{g_1,g_2}[\tau]}$ to $q_{g_1}^{g_2}$ if

$$\theta_{g_1}^{g_2}[\tau] > q_{g_1}^{g_2}[t]. \tag{3.3}$$

The next destination of the transferred packets is temporarily set to $g_2$; when
$g_2$ receives those packets, they are inserted into $u_{g_2}^{g_d^*}$, where $g_d^*$ (from Eq. (3.1))
can be found in the packet headers.

Part III is used by the gateways (that are in the same cluster) to balance
the load amongst themselves using the intra-cluster resources. That is, the
gateways can use any available bandwidth in the cluster to shift load from one
gateway to another.

Part IV: Traffic control at the destination gateways 

When the packets arrive at their destination gateways, they are deposited
into queue $u_{g_d}^{d}$. Let $\theta_{g_d}^{d}[\tau] = u_{g_d}^{d}[\tau]/K_{g_d}$, where $K_{g_d} = T/|\mathcal{C}(g_d)|$. In
each time slot $t \in [\tau T, (\tau+1)T)$, $\eta$ packets are transferred from $u_{g_d}^{d}$ to $q_{g_d}^{d}$ if

$$\theta_{g_d}^{d}[\tau] > q_{g_d}^{d}[t]. \tag{3.4}$$

Part V: Routing and scheduling within a cluster 

In each time slot $t$, each cluster $\mathcal{C}$ computes $\mu_{\mathcal{C}}[t]$ such that

$$\mu_{\mathcal{C}}[t] \in \arg\max_{\mu \in \Gamma_{\mathcal{C}}} \sum_{(m,n) \in \mathcal{L}_{\mathcal{C}}} \mu_{(m,n)} P_{(m,n)}[t], \tag{3.5}$$

where $P_{(m,n)}[t] = q_m^{j_{(m,n)}[t]}[t] - q_n^{j_{(m,n)}[t]}[t]$ and $j_{(m,n)}[t] = \arg\max_j \{ q_m^j[t] - q_n^j[t] \}$.
After the computation, node $m$ transmits $\mu_{(m,n)}[t]$ packets out of queue $q_m^{j_{(m,n)}[t]}$
to node $n$ in time slot $t$. $\Gamma_{\mathcal{C}}$ is the set of all feasible rates in the cluster $\mathcal{C}$.

Part VI: Routing between gateways and mobiles

At the beginning of each super time slot $\tau$, the mobile $m$ and the
gateway $g$ that are in contact compute the following:

$$j_{(m,g)}[\tau] \in \arg\max_{j:\ \text{gateway}} \{ u_m^j[\tau] - u_g^j[\tau] \}, \tag{3.6}$$

$$j_{(g,m)}[\tau] \in \arg\max_{j:\ \text{gateway}} \{ u_g^j[\tau] - u_m^j[\tau] \}. \tag{3.7}$$

Afterwards, $m$ transmits $R$ packets from the queue $u_m^{j_{(m,g)}[\tau]}$ to $g$, and the
gateway $g$ transmits $R$ packets from the queue $u_g^{j_{(g,m)}[\tau]}$ to $m$, maximizing
the respective queue differentials in (3.6) and (3.7).
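
The exchange on contact in Part VI can be sketched as follows. This is a simplified illustration with hypothetical names (`pick_queue`, `contact`): each side's type-II queues are held in a dictionary keyed by destination gateway, at most R packets move in each direction, and the bookkeeping of Part IV for packets that have reached their destination gateway is omitted. The sketch also skips a transfer when no differential is positive, which is a common back-pressure convention and an assumption here rather than something stated in the text.

```python
from typing import Dict, Optional


def pick_queue(sender: Dict[str, int], receiver: Dict[str, int]) -> Optional[str]:
    """Return the gateway index j with the largest positive sender[j] - receiver[j]."""
    best_j, best_diff = None, 0
    for j, backlog in sender.items():
        diff = backlog - receiver.get(j, 0)
        if diff > best_diff:
            best_j, best_diff = j, diff
    return best_j


def contact(mobile: Dict[str, int], gateway: Dict[str, int], R: int) -> None:
    """Exchange up to R packets in each direction, updating both queue maps in place."""
    j_mg = pick_queue(mobile, gateway)   # both choices use lengths at the start of contact
    j_gm = pick_queue(gateway, mobile)
    for src, dst, j in ((mobile, gateway, j_mg), (gateway, mobile, j_gm)):
        if j is None:
            continue
        moved = min(R, src[j])
        src[j] -= moved
        dst[j] = dst.get(j, 0) + moved


# Assumed example: queue lengths (packets) for destination gateways g2 and g3.
mobile = {"g2": 5, "g3": 12}
gateway = {"g2": 20, "g3": 2}
contact(mobile, gateway, R=10)
```

In this example the mobile drops off 10 packets destined for g3, where its backlog exceeds the gateway's, and picks up 10 packets destined for g2, where the gateway's backlog is larger.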

Part VII: Real and regulated queues 

For analytical purposes, we assume that each type-II queue at a gateway
consists of two parts: a real type-II queue, denoted $u$, and a regulated type-II
queue, denoted $\hat{u}$. At gateway $g_1$, the amount of packets transferred from
the real queue $u_{g_1}^{g_2}$ to the regulated queue $\hat{u}_{g_1}^{g_2}$ in time slot $t$ is

$$y_{g_1}^{g_2}[t] = (1+\delta) \sum_{[s,d] \in \mathcal{F}_{\text{inter}},\ s \in \mathcal{C}(g_1)} x_{s,g_1}^{g_2,d}[t],$$

where $g_2$ is another gateway, and the amount of packets transferred from
$u_{g_1}^{d}$ to $\hat{u}_{g_1}^{d}$ is

$$y_{g_1}^{d}[t] = (1+\delta) \sum_{[s,d] \in \mathcal{F}_{\text{inter}},\ d \in \mathcal{C}(g_1)} x_{s,g_s^*[\tau]}^{g_1,d}[t],$$

where $d$ is an internal node in $\mathcal{C}(g_1)$ and $\delta > 0$, and

$$x_{s,g_s}^{g_d,d}[t] = \begin{cases} x_s^d[t], & \text{if } g_s = g_s^*[\tau] \text{ and } g_d = g_d^*[\tau], \\ 0, & \text{else} \end{cases}$$

is the amount of inter-cluster traffic $[s,d]$ that is assigned to the gateways
$g_s^*[\tau] \in \mathcal{K}_{\mathcal{C}(s)}$ and $g_d^*[\tau] \in \mathcal{K}_{\mathcal{C}(d)}$ in time slot $t \in [\tau T, (\tau+1)T)$. Though
regulated queues are not used in practice, they are needed to prove the stability
and throughput optimality of our algorithm.

Remark: In our two-scale BP+SR algorithm with queue reduction, the 
source nodes need to know the queue-lengths of all gateways in the source 
cluster and destination clusters; this is needed to reduce the number of queues 
maintained at the other internal nodes in the same cluster. We note that it 
would be difficult to obtain this information instantaneously as required in 
the algorithm. However, similar to the cluster-based back-pressure proposed
in [82], we can use delayed queue length information, and the algorithm
remains throughput-optimal. (The analysis of our algorithm would, however,
have to account for the presence of the two time-scales.) We skip this because 
similar analysis is provided in [82]. 

3.5.3 Throughput Optimality 

We now prove that our BP+SR is throughput optimal. 

Theorem 2. Fix any $\delta > 0$. Given external arrival $x$ such that $(1+\delta+\epsilon)x$
is supportable for some $\epsilon > 0$ (i.e., there exists an algorithm that can stabilize
the network with traffic load $(1+\delta+\epsilon)x$), all queues are bounded under the
BP+SR algorithm.

Sketch of the proof: We first bound the lengths of type-II queues when the
routing algorithms (Eq. (3.1), (3.2), (3.3), and (3.4)) are updated every $\tilde{T}$
super time slots; we will refer to $\tilde{T}$ super time slots as a super-super time slot.
We assume that $\tilde{T}$ is large enough so that for any mobile $m$ and any gateway
$g$ that $m$ comes into contact with, $m$ makes at least $(1+\epsilon)^{-1} (\pi_m)_g \tilde{T}$ contacts
with $g$ over $\tilde{T}$ super time slots. Let $\tilde{\tau}$ denote the $\tilde{\tau}$-th super-super time slot. Then,
we will use this bound to obtain the upper bound when routing algorithms are
updated every super time slot.

We define a Lyapunov function $V[\tilde{\tau}\tilde{T}] = \sum_n \sum_j \left( u_n^j[\tilde{\tau}\tilde{T}] \right)^2$, and let
$\Delta_{\tilde{T}} V[\tilde{\tau}\tilde{T}] = V[(\tilde{\tau}+1)\tilde{T}] - V[\tilde{\tau}\tilde{T}]$.
Using our BP+SR algorithms (3.1), (3.2), (3.3), (3.4), (3.5), (3.6), and (3.7),
and the fact that $(1+\delta+\epsilon)x$ is supportable, we can show that $\Delta_{\tilde{T}} V[\tilde{\tau}\tilde{T}] < -\delta$
if $u_{n_1}^{n_2}[\tilde{\tau}\tilde{T}] > U_{\max}$ for some $n_1$, $n_2$ and for some $U_{\max}$, from which we can show
that $V[\tilde{\tau}\tilde{T}] < K^2$ and $u_{n_1}^{n_2}[\tilde{\tau}\tilde{T}] < K$, $\forall n_1, n_2$, where $K$ is some positive
constant.

The probability of the mobile not exhibiting the stationary distribution
in $\tilde{T}$ super time slots is exponentially decreasing in $\tilde{T}$. Thus, we can obtain an
expected upper bound on the type-II queues when our algorithms (3.1), (3.2),
(3.3), and (3.4) are updated every super time slot. We can then use Theorem
1 of [50] to bound the type-I queues since in Eqs. (3.2), (3.3) and (3.4), type-II
queue sizes are used as a linear utility function. Each super time slot is long
enough so that within each cluster, any utility-based back-pressure rate control
algorithm (that is updated every time slot) using the type-I queues converges.
For the complete proof, see the Appendix of this chapter. ■

3.5.4 Buffer Usage & Delay 

Consider a line network with two gateways and a mobile, as shown in 
Figure 3.1 (without the intra-cluster traffic). Let there be an inter-cluster flow 
$[s, d]$ (with $s$ being 1.100 and $d$ being 2.100 in Fig. 3.1). Let $g_s$ and $g_d$ be the 
source and destination gateways, respectively ($g_s$ is 1.104 and $g_d$ is 2.103). The 
source cluster size is $N_c$, so that the number of hops from $s$ to $g_s$ is $N_c - 1$. 
Assume that the mobile $m$ (0.100 in the figure) shuffles between $g_s$ and $g_d$ 
every $T$ time slots, so that two consecutive contacts at a gateway are $2T$ time 
slots apart; the mobile makes contact with the source gateway in time slots 
$2T, 4T, 6T, \ldots$ The system starts at time slot 0. Let $s$ generate traffic for 
$d$ at the average rate of $1 - \gamma$, $1 > \gamma > 0$, packets per time slot (in each time 
slot, $s$ generates one packet with probability $1 - \gamma$ and generates nothing with 
probability $\gamma$). Assume that each node $i$ can transmit one packet to node $i - 1$ and 
receive one packet from node $i + 1$ simultaneously per time slot and that all links 
are directed. We assume directed links only for the purpose of analysis in 
this subsection, and this restriction is placed to prevent packet looping in our 
analysis. Assume that the mobile and the gateway can exchange $2T$ packets 
per contact. We will use $i$ to denote the nodes in the source cluster, with $i = 1$ 
being $g_s$, $i = 2$ being the immediate neighbor of $g_s$, $i = 3$ being the node two 
hops from $g_s$, and so on. 

Assumption 3. For the purpose of the following lemma, we assume that $T$ 
is large enough so that the number of packets generated by the source over $2T$ 
time slots is between $2T(1 - \gamma - \epsilon)$ and $2T(1 - \gamma + \epsilon)$, for a sufficiently small 
$\epsilon$. In addition, we assume that packets are transmitted out of the queues at the 
beginning of a time slot, and inserted into the queues in the middle of a time 
slot, and that the source generates packets at the end of a time slot. 

We will use q, q, q to denote the queue size at the beginning, in the 
middle, and at the end of a time slot, respectively. We will also use $G_n$ to 
denote the total number of packets the source generates from time slots $2nT$ 
to $2(n + 1)T - 1$. By Assumption 3, $2T(1 - \gamma - \epsilon) \le G_n \le 2T(1 - \gamma + \epsilon)$. 
The assumption that the number of packets the source generates is 
between $2T(1 - \gamma - \epsilon)$ and $2T(1 - \gamma + \epsilon)$ over $2T$ time slots can be justified 
using the Chernoff bound: assume i.i.d. Bernoulli arrivals, and for any $\epsilon > 0$, 
the probability that the source generates more than $2T(1 - \gamma + \epsilon)$ packets in 
$2T$ time slots is less than or equal to 

$$\exp\left(-2T \times D\big((1 - \gamma + \epsilon)\,\|\,(1 - \gamma)\big)\right),$$

and the probability that the source generates fewer than $2T(1 - \gamma - \epsilon)$ packets 
in $2T$ time slots is less than or equal to 

$$\exp\left(-2T \times D\big((1 - \gamma - \epsilon)\,\|\,(1 - \gamma)\big)\right),$$

where 

$$D(x\|y) = x \log\left(\frac{x}{y}\right) + (1 - x) \log\left(\frac{1 - x}{1 - y}\right).$$

When $T$ is sufficiently large, both probabilities are very small. 
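To get a feel for how quickly the two tail bounds above shrink with $T$, the following is a minimal Python sketch that evaluates them numerically. The function names and the sample values of $\gamma$ and $\epsilon$ are illustrative assumptions, not taken from the text; natural logarithms are assumed in $D(x\|y)$.

```python
import math


def kl_bernoulli(x: float, y: float) -> float:
    """D(x||y) between Bernoulli(x) and Bernoulli(y), natural log.

    Assumes 0 < y < 1 and 0 <= x <= 1 (0 * log 0 is treated as 0).
    """
    def term(a: float, b: float) -> float:
        return 0.0 if a == 0.0 else a * math.log(a / b)

    return term(x, y) + term(1.0 - x, 1.0 - y)


def chernoff_tail_bounds(T: int, gamma: float, eps: float):
    """Upper bounds on P(G > 2T(1-gamma+eps)) and P(G < 2T(1-gamma-eps)),

    where G is the number of packets generated in 2T slots by an
    i.i.d. Bernoulli(1 - gamma) source.
    """
    p = 1.0 - gamma
    upper = math.exp(-2 * T * kl_bernoulli(p + eps, p))
    lower = math.exp(-2 * T * kl_bernoulli(p - eps, p))
    return upper, lower


if __name__ == "__main__":
    # Illustrative values only (assumed, not from the text).
    for T in (100, 1000, 10000):
        print(T, chernoff_tail_bounds(T, gamma=0.2, eps=0.05))
```

Both bounds decay exponentially in $T$, which is why a sufficiently large $T$ makes Assumption 3 hold with high probability.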

Lemma 2. Suppose all queues are initially empty. Under BP, there exists a 
time slot $t_d$ such that all packets the source generates after $t_d$ would experience 
a delay of at least $(N_c - 1)(2T(1 - \gamma - \epsilon) - 1)$ before being picked up by the 
mobile, where $\epsilon > 0$ so that Assumption 3 holds. Under BP+SR, the delay 
any packet would experience before being picked up by the mobile is at most 
$N_c^2 + 3T$. 

Proof: Under traditional BP: Let $\{n_1, n_2, \ldots\}$ be an infinite sequence of positive 
integers such that $n_l \ne n_{l'}$ if $l \ne l'$ and at the beginning of time slots $2n_lT$ 
(just before the mobile picks up packets), 

$$q_1^d[2n_lT] \ge 2T(1 - \gamma). \qquad (3.8)$$

Such a sequence exists because the mobile has to transport at least $2T(1 - \gamma)$ 
packets infinitely often in order to support the source rate of $1 - \gamma$ packets per 
time slot. We set $t_d = 2n_1T$. 

Sketch of Proof: We first show that new packets generated in time slots $2n_lT$, 
$2n_lT + 1, \ldots, 2(n_l + 1)T - 1$ would experience delay of at least 
$(N_c - 1)(2T(1 - \gamma - \epsilon) - 1)$ before reaching the source gateway because 
$q_i^d[2n_lT] \ge 2T(1 - \gamma) - 1$ for $i = 2, \ldots, N_c$. To show this, we need to show that 
$q_i^d[2n_lT] \ge 2T(1 - \gamma) - 1$, $i = 2, \ldots, N_c$ if $q_1^d[2n_lT] \ge 2T(1 - \gamma)$ (see Eq. (3.13)). 

We then show that any new packets generated in time slots $2(n_l + 1)T$, 
$2(n_l + 1)T + 1, \ldots, 2(n_l + 2)T - 1$ would also experience delay of at least 
$(N_c - 1)(2T(1 - \gamma - \epsilon) - 1)$. For this, we show that 

$$\sum_{i=2}^{N_c} q_i^d[2(n_l + 1)T] \ge (N_c - 1)(2T(1 - \gamma - \epsilon) - 1),$$

which requires the fact that $q_i^d[2n_lT] \ge 2T(1 - \gamma) - 1$, $i = 1, \ldots, N_c$. See Eq. 
(3.18). 

We finally show that for $k > 1$ such that $n_l + k < n_{l+1}$, any new packets 
generated in time slots $2(n_l + k)T, 2(n_l + k)T + 1, \ldots, 2(n_l + k + 1)T - 1$ would 
also experience delay of at least $(N_c - 1)(2T(1 - \gamma - \epsilon) - 1)$ by showing that 

$$\sum_{i=2}^{N_c} q_i^d[2(n_l + k)T] \ge (N_c - 1)(2T(1 - \gamma - \epsilon) - 1),$$

which requires that $\sum_{i=2}^{N_c} q_i^d[2(n_l + k - 1)T] \ge (N_c - 1)(2T(1 - \gamma - \epsilon) - 1)$. See 
Eq. (3.20). 

Details of Proof: For any $i = 1, \ldots, N_c - 1$ and at the beginning of any time 
slot $t$, if $q_i^d[t] = q + 1$, where $q \ge 0$, then 

$$\exists\, l_{i,t} > 0 \ \text{s.t.}\ q_{i+1}^d[t - l_{i,t}] \ge q + 1. \qquad (3.9)$$

We prove Eq. (3.9) by contradiction. If $q_{i+1}^d[t - l_{i,t}] < q + 1$ for all $l_{i,t} > 0$, 
then $q_i^d$ could not have increased to $q + 1$, since node 
$i + 1$ would transmit a packet to node $i$ in some time slot $\tau$ if and only if 
$q_{i+1}^d[\tau] - q_i^d[\tau] > 0$ and node $i + 1$ is the only node that transmits to node $i$; 
since $q_{i+1}^d[t - l_{i,t}] < q + 1$ for all $l_{i,t}$, the maximum that $q_i^d[t]$ can be is $q$, which 
is a contradiction. There can be multiple values of $l_{i,t}$ for which $q_{i+1}^d[t - l_{i,t}] \ge q + 1$ 
holds; we let $l_{i,t}$ take on the smallest value $> 0$ so that $q_{i+1}^d[t - l_{i,t}] \ge q + 1$. 

We now show by contradiction that if $q_i^d[t] = q + 1$, then $q_{i+1}^d[t'] \ge q$, $t > t' > t - l_{i,t}$. 
Suppose there is $t'$ such that $t > t' > t - l_{i,t}$ and $q_{i+1}^d[t'] < q$. 
Since $q_{i+1}^d[t - l_{i,t}] \ge q + 1$, this implies that $q_{i+1}^d[t''] = q$ in some time slot 
$t''$ between $t - l_{i,t}$ and $t'$, and that node $i + 1$ transmitted a packet to node $i$ without having 
received a packet from node $i + 2$ (or generated a packet); this is the only way 
node $i + 1$ could decrease its queue length from $q$ to $q - 1$. Since a packet has 
been transmitted from node $i + 1$ to node $i$ in time slot $t''$, $q_{i+1}^d[t''] - q_i^d[t''] > 0$, 
which implies that $q_i^d[t''] \le q - 1$. Since $q_i^d$ has to increase to $q + 1$ by time $t$, 
there is a time slot $t'''$ between $t''$ and $t$ so that $q_{i+1}^d[t'''] \ge q + 1$ (otherwise, 
$q_i^d[t]$ could not reach $q + 1$), which violates the assumption that the smallest 
value for $l_{i,t}$ is chosen so that $q_{i+1}^d[t - l_{i,t}] \ge q + 1$ holds. Thus, 

$$q_{i+1}^d[t'] \ge q, \quad t > t' > t - l_{i,t}. \qquad (3.10)$$

Note that in addition, if $q_i^d[t], q_{i+1}^d[t] \ge q$, then as long as 

$$q_i^d[\hat{t}] \ge q \ \text{for}\ \hat{t} > t, \quad q_{i+1}^d[\hat{t}] \ge q \qquad (3.11)$$

since node $i + 1$ will not transmit to node $i$ if $q_{i+1}^d[\hat{t}] - q_i^d[\hat{t}] \le 0$. 
By eqs. (3.8) and (3.9), there is $l_{1,2n_lT}$ such that 

$$q_2^d[2n_lT - l_{1,2n_lT}] \ge 2T(1 - \gamma). \qquad (3.12)$$

By Eq. (3.10), $q_2^d[2n_lT] \ge 2T(1 - \gamma) - 1$. 

By eqs. (3.12) and (3.9), there is $l_{2,\,2n_lT - l_{1,2n_lT}}$ such that 
$q_3^d[2n_lT - l_{1,2n_lT} - l_{2,\,2n_lT - l_{1,2n_lT}}] \ge 2T(1 - \gamma)$. By Eq. (3.10), 
$q_3^d[2n_lT - l_{1,2n_lT}] \ge 2T(1 - \gamma) - 1$. By eqs. (3.12) and (3.11), $q_3^d[2n_lT] \ge 2T(1 - \gamma) - 1$. 

Similar to the way we showed that $q_3^d[2n_lT] \ge 2T(1 - \gamma) - 1$, we can 
show that $q_4^d[2n_lT], q_5^d[2n_lT], \ldots, q_{N_c}^d[2n_lT]$ are also $\ge 2T(1 - \gamma) - 1$. Thus, 

$$q_i^d[2n_lT] \ge 2T(1 - \gamma) - 1 \qquad (3.13)$$

for $i = 2, \ldots, N_c$ if $q_1^d[2n_lT] \ge 2T(1 - \gamma)$. Thus, any packet the source generates 
in time slots $2n_lT, 2n_lT + 1, \ldots, 2(n_l + 1)T - 1$ would see a delay of at least 
$(N_c - 1)(2T(1 - \gamma) - 1)$ before reaching the source gateway. 

Now we show by contradiction that for any $l > 0$, 

$$q_1^d[2(n_l + 1)T] \le L_{(n_l+1)} \qquad (3.14)$$

where 

$$L_{(n_l+1)} = \frac{1}{N_c}\left(\sum_{i=2}^{N_c} q_i^d[2n_lT] + G_{n_l} + \max\{q_1^d[2n_lT] - 2T, 0\}\right) \qquad (3.15)$$

and $\max\{q_1^d[2n_lT] - 2T, 0\}$ is the number of packets remaining in $q_1^d$ after the 
mobile picked up packets from the source gateway in time slot $2n_lT$. 

Suppose $q_1^d[2(n_l + 1)T] \ge L_{(n_l+1)} + 1$ for some $l$. Then, similar to the 
way we showed Eq. (3.13), we can show that $q_i^d[2(n_l + 1)T] \ge L_{(n_l+1)}$ for 
$i = 2, \ldots, N_c$. Then, if $q_1^d[2(n_l + 1)T] \ge L_{(n_l+1)} + 1$, then 
$\sum_{i=1}^{N_c} q_i^d[2(n_l + 1)T] \ge N_c L_{(n_l+1)} + 1$ at the very beginning of the time slot $2(n_l + 1)T$, which is 
impossible because at the very end of time slot $2(n_l + 1)T - 1$, 

$$\sum_{i=1}^{N_c} q_i^d[2(n_l + 1)T - 1] = \sum_{i=2}^{N_c} q_i^d[2n_lT] + G_{n_l} + \max\{q_1^d[2n_lT] - 2T, 0\} = N_c L_{(n_l+1)}. \qquad (3.16)$$

Eq. (3.16) should equal $\sum_{i=1}^{N_c} q_i^d[2(n_l + 1)T]$ because by Assumption 3, the 
number of packets in the queues at the very end of a time slot is the same as 
at the very beginning of the next time slot. Hence, $q_1^d[2(n_l + 1)T] \le L_{(n_l+1)}$, 
and by subtracting $q_1^d[2(n_l + 1)T]$ from the LHS and $L_{(n_l+1)}$ from the RHS of Eq. 
(3.16), 

$$\sum_{i=2}^{N_c} q_i^d[2(n_l + 1)T] \ge (N_c - 1)L_{(n_l+1)}. \qquad (3.17)$$

(Note that the queue sizes at the beginning of time slot $2(n_l + 1)T$ equal those at the end of time slot $2(n_l + 1)T - 1$.) 
By definition Eq. (3.15), 

$$L_{(n_l+1)} = \frac{1}{N_c}\left(\sum_{i=2}^{N_c} q_i^d[2n_lT] + G_{n_l} + \max\{q_1^d[2n_lT] - 2T, 0\}\right)$$
$$\ge \frac{1}{N_c}\big((N_c - 1)(2T(1 - \gamma) - 1) + 2T(1 - \gamma - \epsilon)\big)$$
$$\ge 2T(1 - \gamma - \epsilon) - 1,$$

where the second inequality holds by Eq. (3.13) and by Assumption 3. Combining 
this with Eq. (3.17), 

$$\sum_{i=2}^{N_c} q_i^d[2(n_l + 1)T] \ge (N_c - 1)(2T(1 - \gamma - \epsilon) - 1). \qquad (3.18)$$

Any packet that the source generates in time slots $2(n_l + 1)T, 2(n_l + 1)T + 1, 
\ldots, 2(n_l + 2)T - 1$ would see a delay of at least $(N_c - 1)(2T(1 - \gamma - \epsilon) - 1)$ 
before reaching the source gateway. This is because the rate of the 
link from node 2 to node 1 is one packet per time slot and there are at least 
$(N_c - 1)(2T(1 - \gamma - \epsilon) - 1)$ packets before the newly generated packets. 

For all $k > 1$ such that $n_l + k < n_{l+1}$, we can show that 

$$L_{n_l+k} = \frac{1}{N_c}\left(\sum_{i=2}^{N_c} q_i^d[2(n_l + k - 1)T] + G_{n_l+k-1} + \max\{q_1^d[2(n_l + k - 1)T] - 2T, 0\}\right),$$

from which we can show that 

$$\sum_{i=2}^{N_c} q_i^d[2(n_l + k)T] \ge (N_c - 1)L_{n_l+k}. \qquad (3.19)$$

(Replace $n_l + 1$ and $n_l$ with $n_l + k$ and $n_l + k - 1$, respectively, in eqs. (3.14) 
through (3.17) and follow the same reasoning.) From there, we can further 
show that 

$$L_{n_l+k} \ge \frac{1}{N_c}\big((N_c - 1)(2T(1 - \gamma - \epsilon) - 1) + 2T(1 - \gamma - \epsilon)\big) \ge 2T(1 - \gamma - \epsilon) - 1.$$

For example, $L_{n_l+2} \ge 2T(1 - \gamma - \epsilon) - 1$ because of Eq. (3.17), which implies 
that 

$$\sum_{i=2}^{N_c} q_i^d[2(n_l + 2)T] \ge (N_c - 1)(2T(1 - \gamma - \epsilon) - 1) \qquad (3.20)$$

by plugging in the value for $L_{n_l+2}$ into Eq. (3.19). Because of Eq. (3.20), 
$L_{n_l+3} \ge 2T(1 - \gamma - \epsilon) - 1$, which then implies 
$\sum_{i=2}^{N_c} q_i^d[2(n_l + 3)T] \ge (N_c - 1)(2T(1 - \gamma - \epsilon) - 1)$. Following the same reasoning, we can show that 
$\sum_{i=2}^{N_c} q_i^d[2(n_l + k)T] \ge (N_c - 1)(2T(1 - \gamma - \epsilon) - 1)$ for $k$ such that $n_l + k < n_{l+1}$. 

Thus, any packet the source generates in time slots $2(n_l + k)T, 2(n_l + k)T + 1, 
\ldots, 2(n_l + k + 1)T - 1$ would see a delay of at least $(N_c - 1)(2T(1 - \gamma - \epsilon) - 1)$ 
before reaching the source gateway. 

For $q_1^d$ to reach at least $2T(1 - \gamma)$ in time slot $2nT$, $q_2^d$ must have been 
at least $2T(1 - \gamma)$ at some point in time before $2nT$, or otherwise $q_1^d$ could not 
reach $2T(1 - \gamma)$. Since no packet can be transmitted out of $q_1^d$ into $q_2^d$ (because 
the links are directed from node $i + 1$ to node $i$), $q_2^d[2nT] \ge 2T(1 - \gamma) - 1$. 
Likewise, since $q_2^d$ must reach at least $2T(1 - \gamma)$, $q_3^d$ must have been at least 
$2T(1 - \gamma)$ at some point in time before $q_2^d$ reaches $2T(1 - \gamma)$. Thus, 
$q_3^d[2nT] \ge 2T(1 - \gamma) - 1$, and using the same reasoning, $q_i^d[2nT] \ge 2T(1 - \gamma) - 1$, 
$i = 2, \ldots, N_c$. 

Since the net change in $q_2^d$ in each time slot is zero, $q_2^d[t] \ge 2T(1 - \gamma) - 1$ 
$\forall\, t > 0$. Because the link between nodes 3 and 4 is directed, $q_3^d[t] \ge 2T(1 - \gamma) - 1$ 
$\forall\, t > 0$. Using the same logic, $q_i^d[t] \ge 2T(1 - \gamma) - 1$ $\forall\, t > 0$ and $\forall\, i \ge 2$. 

Under BP+SR: Note that $q_1^{g_s} = q_{g_s}^{g_s} = 0$ $\forall\, t$. Whenever $q_2^{g_s}[t] = 1$ for some 
time slot $t$, in the next time slot, one packet will be transmitted out of $q_2^{g_s}$ to 
$q_1^{g_s}$. Since at most one packet can be transmitted into $q_2^{g_s}$ each time slot, we 
have $q_2^{g_s}[t] \le 2$ $\forall\, t$. Likewise, whenever $q_3^{g_s}[t] = 2$ for some time slot $t$, in the 
next time slot, one packet will be transmitted from $q_3^{g_s}$ to $q_2^{g_s}$, and since at 
most one packet can be transmitted to $q_3^{g_s}$, we have $q_3^{g_s}[t] \le 3$ $\forall\, t$. Continuing 
with this logic, we have $q_i^{g_s}[t] \le i$ $\forall\, t$. 

In each time slot $t$, if $u_s^{g_s}[t] > K_s q_s^{g_s}[t]$ (with $K_s = T/N_c$), then $\eta = 1$ 
packet will be transferred from $u_s^{g_s}$ to $q_s^{g_s}$. Since $q_s^{g_s}[t] \le N_c$ $\forall\, t$, we have 
$u_s^{g_s}[t] \le K_s N_c = T$ $\forall\, t$. 

Note that the length of $u_{g_s}^{g_d}$ is upper bounded by $2T$, since only $2T$ 
packets can accumulate at $g_s$ before the mobile comes and picks up the $2T$ 
packets. Thus, there are at most $N_c^2 + 3T$ packets waiting to be transported 
by the mobile ahead of any newly generated packet. Moreover, because $q_1^{g_s} = 0$ 
$\forall\, t$, one packet will be transmitted over the link between node 2 and node 1 if 
there are any packets in the cluster. ■ 

Considering a line network as we have done here is the same as fixing the 
route the inter-cluster flow takes to reach the gateways. Though our analysis 
here is for a line network, we believe the claim holds in two dimensional clusters 
where the hop count to reach the gateways increases as the cluster size increases 
since the proof above essentially shows that what matters is the hop count to 
the gateway, not the size of the cluster. 

If there are multiple inter-cluster flows, the characteristic of the delay 
that one would observe under BP+SR and the traditional BP would remain 
unchanged. That is, under BP+SR, the delay to reach the mobile transport 
would be linear in $N_c^2$ and $T$, whereas under the traditional BP, the delay would 
be linear in the product $N_c T$. This is because under BP, each internal node 
would have to match the time scale of the intermittent connectivity by having 
large queues (on the order of $\Theta(T)$), whereas under BP+SR, this is not 
necessary. 
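The gap between the two scalings can be made concrete by evaluating the two bounds of Lemma 2 side by side. The following is a small Python sketch; the function names and the sample parameter values are illustrative assumptions, while the two expressions are the BP lower bound $(N_c - 1)(2T(1 - \gamma - \epsilon) - 1)$ and the BP+SR upper bound $N_c^2 + 3T$ from the lemma.

```python
def bp_delay_lower_bound(n_c: int, T: int, gamma: float, eps: float) -> float:
    """Lemma 2 lower bound on the delay (in time slots) under traditional BP."""
    return (n_c - 1) * (2 * T * (1 - gamma - eps) - 1)


def bpsr_delay_upper_bound(n_c: int, T: int) -> int:
    """Lemma 2 upper bound on the delay (in time slots) under BP+SR."""
    return n_c ** 2 + 3 * T


if __name__ == "__main__":
    # Illustrative values only (assumed, not from the text).
    T, gamma, eps = 1000, 0.1, 0.01
    for n_c in (5, 10, 20, 40):
        print(n_c,
              bp_delay_lower_bound(n_c, T, gamma, eps),
              bpsr_delay_upper_bound(n_c, T))
```

For a fixed $T$, the BP bound grows with the product $N_c T$, while the BP+SR bound grows only additively in $N_c^2$ and $T$.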

Since we assume that the internal nodes all use FIFO queues, the end-to-end 
delay under BP scales linearly with $N_c T$, which is what we observed in 
Figure 3.2(b). 

One last case we consider is when we have multiple mobiles with different 
shuttling times; for example, in the line network we have one mobile 
that shuttles between the two clusters every 100 time slots and another mobile 
that takes a million time slots to do the same. In such cases, the average 
inter-cluster delay will be on the order of the longer shuttling time. 

3.6 Two-Scale BP with Rate Control 

When the back-pressure rate control algorithm is implemented on an ICN, 
the previously defined queuing architecture needs to be modified slightly. The 
main reason for this is that the back-pressure rate control algorithm uses local 
queue lengths to detect congestion. The length of queue $u^{g_s}$ (see Eq. (3.1)) 
maintained at an inter-cluster source $s$ would only indicate congestion 
between $s$ and $g_s$. Thus, the length of $u^{g_s}$ cannot be used to measure the level of 
congestion between the source and the ultimate destination. Therefore, in our 
two-scale BP with rate control, we eliminate type-I queues for gateways, and 
instead have queues directly to the destinations. 

3.6.1 Queuing Architecture 

The queuing architecture for internal nodes is the same as under the 
traditional back-pressure algorithm, i.e., an internal node $i$ maintains a queue 
for each destination that it receives a packet for: if $i$ receives a packet destined 
for $n$, it will create and maintain the queue $q_i^n$. Each gateway node $g$ does the 
same, and in addition, for each inter-cluster destination $d$ in the network, it 
will create and maintain the queue $\tilde{q}_g^d$. 

To understand how the queuing architecture works for our two-scale 
BP with rate control, consider a cluster $C$ and a gateway $g$ in $C$. Suppose an 
inter-cluster flow $[s, d]$ originates from $C$. The gateway $g$ maintains $q_g^d$ and $\tilde{q}_g^d$. 
All internal nodes in $C$ maintain a queue for $d$, and the internal nodes in the 
neighborhood of $g$ transmit packets for $d$ into $g$'s $q_g^d$. Once the packets for $d$ 
arrive at $g$, they are immediately placed into $\tilde{q}_g^d$ (bypassing $q_g^d$). The gateway 
$g$ does not advertise the queue length of $\tilde{q}_g^d$ internally; instead, it advertises 
$\tilde{q}_g^d/T$ as the queue length of $q_g^d$. (Note that $q_g^d$ is empty at all times because any 
packet that comes into $q_g^d$ is immediately taken out and placed into $\tilde{q}_g^d$. Though 
it is empty, the length of $q_g^d$ is advertised to the internal nodes as $\tilde{q}_g^d/T$.) Each 
internal node $i$ in the cluster advertises $q_i^d$ without any scaling, and the back-pressure 
algorithm operates within the cluster based on the advertised queue 
lengths of $q_n^d$ $\forall\, n \in C$. 
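The advertising rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and method names are hypothetical, packets are opaque objects, and the only behavior modeled is that packets for an inter-cluster destination go straight into the gateway's inter-cluster queue while the length advertised to internal nodes is that queue's length divided by $T$.

```python
from collections import deque


class Gateway:
    """Hypothetical model of a gateway's per-destination inter-cluster queues."""

    def __init__(self, T: float):
        self.T = T                 # time-scale factor (estimate of T)
        self.inter_cluster = {}    # destination -> deque of queued packets

    def receive_from_cluster(self, dest, packet) -> None:
        # Packets bypass the (always empty) internal queue and are placed
        # directly into the inter-cluster queue for this destination.
        self.inter_cluster.setdefault(dest, deque()).append(packet)

    def advertised_length(self, dest) -> float:
        # Length advertised to internal nodes: inter-cluster backlog scaled by 1/T.
        return len(self.inter_cluster.get(dest, ())) / self.T


if __name__ == "__main__":
    g = Gateway(T=1000)
    for seq in range(500):
        g.receive_from_cluster("d", seq)
    print(g.advertised_length("d"))
```

Scaling by $1/T$ keeps the back-pressure seen inside the cluster on the same order as ordinary intra-cluster queue lengths, even when the gateway backlog is on the order of $T$.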

For each inter-cluster destination $d$, each mobile $m$ maintains $\tilde{q}_m^d$. When 
a gateway comes in contact with a mobile, they use $\tilde{q}_m^d$ and $\tilde{q}_g^d$ to compute the 
back-pressure between them. (In BP+SR, the type-II queues at the gateways 
are for destination gateways. Here, $\tilde{q}_g^d$ is a queue that the gateway $g$ maintains 
for the actual inter-cluster traffic destination $d$.) 

Once the inter-cluster flow packet reaches a gateway $g_d$ in the destination 
cluster, it is placed into $\tilde{q}_{g_d}^d$. As mentioned before, $g_d$ maintains two queues, 
$q_{g_d}^d$ and $\tilde{q}_{g_d}^d$, for $d$. In each time slot $t$, $g_d$ transfers $\eta$ ($\eta \ll R$, where $R$ is 
the number of packets transferred between mobiles and gateways on contact) 
packets from $\tilde{q}_{g_d}^d$ to $q_{g_d}^d$ if and only if 

> m ( 3 - 21 ) 
Once put into $q_{g_d}^d$, the packets are routed to the destination using back-pressure 
routing in the destination cluster. 

To reduce the number of queues to be maintained by each node in our 
implementation, a gateway that receives packets destined for a different cluster 
(i.e., is a way-point gateway), does not send out the inter-cluster packets to 
the internal nodes within its cluster. This way, an internal node only has to 
maintain a queue for every other node in the same cluster as itself and only 
for the other nodes in different clusters that are destinations of inter-cluster 
traffics originating from the same cluster. 

3.6.2 Impact of T Estimation 

Our two-scale BP with rate control algorithm requires knowledge of 
the time scale difference between the intra-cluster wireless packet transmission 
and the mobility. In fact, even a rough estimate (anything $O(T)$) of the 
difference is good enough, and the throughput optimality would still hold. 
The best scaling factor $T$ would be the ratio of the time it takes 
the mobiles to make two contacts to the intra-cluster time slot; however, this 
is difficult to measure precisely. If too large an estimate is used for $T$, it 
would result in longer queues at the gateways and a longer time before the 
inter-cluster rates converge for two-scale BP with rate control; for BP+SR, 
the time it takes for the source routing to converge would be increased as 
well. If too small an estimate is used, this would result in fluctuations in 
the instantaneous inter- and intra-cluster rates for two-scale BP. These rate 
fluctuations can be seen in Figures 3.9(a), 3.9(c), and 3.9(b). For BP+SR, too 
small an estimate of $T$ would result in an increase in delay; for example, 
if $T = 1$ is used, this would result in end-to-end delay that scales with the 
cluster size, as depicted in Figure 3.2(b). 

In networks with multiple mobiles with different values of T, the through- 
put optimality property of BP+SR and two-scale BP with rate control will still 
hold. In such case, our algorithms (without any further modifications) can use 
the estimate of the largest T, and the delay performance result of BP+SR in 
Section 3.5.4 would have to be adjusted so that it scales with the largest T. 
However, the fact that only the nodes involved in inter-cluster traffic (gate- 
ways, mobiles, and the inter-cluster source) have to maintain large queues still 
holds in networks with multiple mobiles with different T. 

3.6.3 Rate Controlled BP 

Utility maximization has been addressed in a back-pressure framework 
[2,23,48,50,68] to address rate control issues. 

We use the formulation in [2] to describe the idea here. Each flow $[s, d]$ 
(whether inter- or intra-cluster) has a utility function $U_{[s,d]}(x_s^d)$, which is a 
function of the rate $x_s^d$ it is served at. We assume that all utility functions 
are strictly concave with continuous derivatives. The utility maximization 
problem is the following: 

$$\max_{x \in \Lambda} \sum_{[s,d]} U_{[s,d]}(x_s^d). \qquad (3.22)$$

Let $x_s^d[t]$ denote the rate at which the flow $[s, d]$ is served in time slot $t$. The rate 
control algorithm that maximizes (3.22) is the following. In each time slot $t$, 
the source $s$ injects $k > 0$ packets into the queue $q_s^d$ if and only if 

$$U'_{[s,d]}(x_s^d[t]) - \beta q_s^d[t] > 0, \qquad (3.23)$$

where $\beta > 0$ is a control parameter and $U'_{[s,d]}$ is the first derivative of flow 
$[s, d]$'s utility function. The parameter $\beta$ controls how close to the optimal 
rate allocation the system performs, but this comes at the price of longer 
queues. We also refer to [2] for a discussion on this implementation. 
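The injection rule in Eq. (3.23) can be illustrated with a short Python sketch. It assumes a logarithmic utility $U(x) = K \log x$ (the form used in the experiments of Section 3.7.2), so that $U'(x) = K/x$; the function name and the sample values of $K$, $\beta$, and $k$ are illustrative assumptions.

```python
def packets_to_inject(x_rate: float, queue_len: float,
                      K: float, beta: float, k: int) -> int:
    """Return the number of packets the source injects in this time slot.

    Implements the test U'(x) - beta * q > 0 of Eq. (3.23) for the assumed
    utility U(x) = K * log(x), i.e. U'(x) = K / x. Requires x_rate > 0.
    """
    marginal_utility = K / x_rate
    return k if marginal_utility - beta * queue_len > 0 else 0


if __name__ == "__main__":
    # Illustrative values only (assumed, not from the text).
    print(packets_to_inject(x_rate=10.0, queue_len=0, K=200, beta=0.5, k=3))
    print(packets_to_inject(x_rate=10.0, queue_len=100, K=200, beta=0.5, k=3))
```

A short source queue leaves the marginal utility dominant and the source keeps injecting; a long queue signals congestion and the source backs off.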

3.6.4 Comparison of Two-Scale BP Algorithms 

The two-scale BP algorithms we presented in Sections 3.5 and 3.6 both 
separate the time scale of the intermittent connectivity from the time scale of 
the internal connections. Under BP+SR, the rates at which sources generate 
data are fixed; under two-scale BP with rate control, the source rates are 
dynamically adjusted to maximize the sum utility. In addition, under BP+SR 
the internal nodes in the source cluster maintain queues for the gateways, 
instead of inter-cluster destinations, which allows the inter-cluster source to 
perform source routing. This, however, cannot be done when rate control is 
implemented because the inter-cluster congestion signal cannot be passed back 
to the inter-cluster flow source; when the inter-cluster source maintains queues 
for gateways and not for inter-cluster destinations, the large gateway queues 
can signal congestion for gateways and not for the inter-cluster destination. 



[Figure 3.4 legend: inter-cluster traffic src/dest pairs; gateways/mobiles; internal src/sink.]



Figure 3.4: 3x4 ICN used in simulations for Section 3.7.1. We also simulate 
with cluster size 6x4. The pair of nodes labeled 1 communicate with each 
other, likewise for pairs 2 and 3. Under the traditional BP, the end-to-end 
inter-cluster delay approximately doubles as the cluster size increases 
from 3 x 4 to 6 x 4 (see Figure 3.5); however, the delay is invariant to 
the increase in cluster size under BP+SR. Internal nodes in the clusters maintained 
large queues to accommodate the inter-cluster flows under BP; under 
BP+SR, they did not. 

Figure 3.5: CDF of end-to-end delay for inter-cluster traffic 

3.7 Experimental Results 

We present two sets of experimental results; one for two-scale BP with 
queue reduction (BP+SR) and another with rate control. The gain of the 
BP+SR compared to traditional BP is proportional to the product of the 
network size and the time scale of the intermittent connections. In small size 
networks, if the time scale of intermittent connection is large, we will also see 
a significant gain of the BP+SR. 

Because of the limited size of our wireless testbed, we instead simulate 
a large network for BP+SR. 
Figure 3.6: Max (type I) queue length at internal node 

3.7.1 Two-Scale BP with Queue Reduction: BP+SR 

We consider two networks. The first network consists of three clusters, 
with each cluster composed of 12 nodes (3x4); the second network consists 
of three clusters, with each cluster composed of 24 nodes (6x4). Figure 3.4 
is the first network we simulate. There are three pairs of inter-cluster traffic 
sources/sinks, labeled with 1, 2 and 3, i.e., the two nodes labeled with 1 
communicate with each other and likewise for 2's and 3's. The inter-cluster 
traffic sources generate data at a rate of 0.4 pkts/time slot. Mobile m(1) comes 
into contact with gateways g(i, 1), i = 1, 2, 3 only; similarly for m(2). T is set 
to 1000. In the simulation, we have randomly generated intra-cluster traffic, 
such that the combined intra- and inter-cluster traffic utilizes all internal links 
at 90% of their capacity. Each internal link has a capacity of 1 pkt/time slot. 
$\eta$ is set to 10. (If $\eta$ is too large, then the queue lengths would fluctuate by 
large amounts. If $\eta$ is too small, then the transfer algorithms Eq. (3.2), (3.3), 
and (3.4) would need to be executed often, increasing the processing burden.) 

Figure 3.7: Max (type II) queue length at inter-cluster traffic nodes 

The probability that mobile m(1) goes to gateway g((i + 1 mod 3), 1) 
when it is at gateway g(i, 1) is 0.8 (gateway g(i, 1) belongs to cluster i); the 
probabilities that it stays at g(i, 1) or goes to g((i - 1 mod 3), 1) are both equal 
to 0.1. The probability that mobile m(2) goes to g((i - 1 mod 3), 2) is 0.8; the 
probabilities that it stays or goes to g((i + 1 mod 3), 2) are both equal to 0.1. 
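The mobility model above is a three-state Markov chain, and its stationary distribution can be checked numerically. The following Python sketch builds the transition matrix for m(1) and iterates it to convergence; the function names are illustrative, and the gateways are indexed 0, 1, 2 here (an assumption made so that the mod-3 arithmetic is unambiguous).

```python
def transition_matrix(p_forward: float = 0.8, p_stay: float = 0.1,
                      p_back: float = 0.1, n: int = 3):
    """Row-stochastic matrix: from gateway i, go to i+1, stay, or go to i-1 (mod n)."""
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        P[i][(i + 1) % n] += p_forward
        P[i][i] += p_stay
        P[i][(i - 1) % n] += p_back
    return P


def stationary(P, iters: int = 1000):
    """Approximate the stationary distribution by repeated multiplication pi <- pi P."""
    n = len(P)
    pi = [1.0] + [0.0] * (n - 1)
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


if __name__ == "__main__":
    P1 = transition_matrix()                             # mobile m(1)
    P2 = transition_matrix(p_forward=0.1, p_back=0.8)    # mobile m(2): directions swapped
    print(stationary(P1), stationary(P2))
```

Because each column of the matrix also sums to one, both mobiles visit the three gateways equally often in the long run; only the direction in which they tend to cycle differs.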

The number of packets transferred per contact between a mobile and a 
gateway is 1500 pkts/contact (R = 1500 pkts/contact). 
We compare our BP+SR to the traditional BP, as the traditional BP 
(and its variants) is the only other throughput optimal routing algorithm. Figure 
3.6 shows the evolution of the longest queue under BP and the evolution 
of the longest type-I queue under BP+SR. We can see that the longest type-I 
queue is substantially smaller than the longest queue under BP. In our simulations, 
we found that under BP each node in the network had six queues of 
order $O(T)$, corresponding to the six different inter-cluster traffic destinations. 
Under BP+SR, each inter-cluster source had two queues of order $O(T)$, corresponding 
to the two gateways in each cluster. The other nodes with $O(T)$ 
queues were gateways and mobiles. 

In Figure 3.5, we show the CDF of the total end-to-end packet delays 
for inter-cluster traffic. The average end-to-end delay under BP+SR is roughly 
28,000 time slots in both 3x4 and 6x4 cases; the delay under BP is 38,000 
time slots in 3 x 4 case and 78,000 in 6 x 4 case. As discussed in Section 
3.5.4, the delay doubled under BP as the cluster size doubled. The delays 
for intra-cluster flows stayed the same under BP+SR; BP+SR and BP both 
showed short delays for intra-cluster flows. 

3.7.2 Two-Scale BP with Rate Control: Implementation 

We implemented our two-scale BP on our 16-node testbed. Our implementation 
consists of two parts. The first part is the modification of the MadWifi 
wireless device driver for Atheros 5212 to support differentiated levels of 
channel access on a frame-by-frame basis through varying MAC contention parameters 
such as AIFS and the contention window sizes. The second part is our 
implementation of the modified back-pressure routing algorithm on the Click 
Modular Router [39], which utilizes the modified MadWifi to approximately 
solve the MaxWeight optimization problem (3.5) without global knowledge. 
We describe each part below. 

MAC and PHY: We modified the MadWifi driver so that it supports four 
hardware queues, with each queue having different AIFS, CWmax and CWmin 
values shown in Table 3.1. (When two wireless transmissions contend for access 
to the same channel, the wireless transmission with smaller MAC parameter 
values will statistically have more access.) Each hardware queue is given a 
priority number ranging from 0 to 3. The modified device driver inspects 
the TOS field of the IP header of a packet, and injects it into the hardware 
queue with the same priority number as the TOS field. If the BP (Click-layer) 
queue difference between the source hop and the destination hop is greater 
than or equal to 25, between 24 and 14, between 13 and 6, and lower than or 
equal to 5, we assign TOS levels 0, 1, 2 and 3, respectively (we represent this 
mapping by the threshold array L = {25, 13,5}). Here, the threshold arrays 
are chosen experimentally. If the thresholds were too close to or too far from 
the optimal, we did not observe desired behaviors, i.e., a pair of nodes with 
a large queue difference should have more access to the channel than another 
pair with a smaller difference. In our implementation, only one transmission 
packet/frame is stored in the hardware queues at a node at any given time. 
Our modification of the MadWifi device driver is very similar to the one in 
[78], with only minor differences. 
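The mapping from back-pressure queue difference to TOS level described above can be written as a small lookup. The following Python sketch follows the ranges stated in the text (at least 25, 14 to 24, 6 to 13, and at most 5); the function name is an illustrative assumption and this is not the driver or Click code itself.

```python
def tos_level(queue_diff: int) -> int:
    """Map a BP (Click-layer) queue difference to a TOS level / hardware queue priority.

    Level 0 gets the most aggressive MAC parameters (see Table 3.1),
    level 3 the least aggressive.
    """
    if queue_diff >= 25:
        return 0
    if queue_diff >= 14:
        return 1
    if queue_diff >= 6:
        return 2
    return 3


if __name__ == "__main__":
    for diff in (40, 25, 24, 14, 13, 6, 5, 0):
        print(diff, tos_level(diff))
```

A link with a larger queue difference is thus placed in a hardware queue with smaller AIFS and contention window values, which gives it statistically more channel access.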

Routing and Rate Control: We have implemented the traditional as well as 
the modified back-pressure algorithm (presented in Section 3.6.3) in the Click 
Modular Router. Each packet (with 1KB payload) that is sent out is assigned 
a value between 0 and 3 that is written to the TOS field. The assigned value 
depends on the queue length difference that the wireless transmission source 
has with the next-hop destination. Each node broadcasts a beacon on its 
wireless card every 500msecs. The beacon contains the information about the 
queues the node maintains. The nodes also use the beacons to discover their 
neighbors. All data packets received by a node are acknowledged, and the 
BP ACKs sent to the transmitting node also contain the queue information. 
Thus, ACKs and beacons are used to calculate the queue differences. (IEEE 
802.11 ACKs are transmitted, but they are not used by the back-pressure 
implementation layer.) The transmitting node retransmits a data packet if an 
ACK for that packet is not received within 250msecs. The hop-by-hop ACK 
guarantees that all packets are received correctly by the final destination, and 
the ACKs are also used to throttle the transmission rate. We did not use any 
RTS-CTS in our implementation. 

A source node inspects the queue for its destination every 5msecs. It 
runs the back-pressure rate control algorithm in Eq. (3.23) with k = 3; i.e., it 
generates three packets if (3.23) is positive, and generates no packet otherwise 
(let this variable be NumGen). Then it uses the following update algorithm 
to estimate the rate x (in Bps): 

$$x = 0.999\,x + 0.001\,\frac{\text{packet\_size} \times \text{NumGen}}{5\ \text{msecs}}.$$

Priority #    0     1     2     3
AIFS          1     3     5     7
CWmin         1     7     31    255
CWmax         7     63    255   1023

Table 3.1: MAC scheduling parameters of the four MadWifi hardware queues 
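The rate estimate above is an exponentially weighted moving average of the instantaneous injection rate. The following Python sketch shows one update step; the function name is illustrative, and the packet size of 1000 bytes is an assumption standing in for the "1KB payload" mentioned earlier.

```python
PACKET_SIZE_BYTES = 1000   # assumption: "1KB payload" taken as 1000 bytes
INTERVAL_S = 0.005         # the source inspects its queue every 5 msecs


def update_rate(x: float, num_gen: int,
                packet_size: int = PACKET_SIZE_BYTES,
                interval: float = INTERVAL_S) -> float:
    """One step of x = 0.999 x + 0.001 * (packet_size * NumGen / interval), in Bps."""
    return 0.999 * x + 0.001 * (packet_size * num_gen) / interval


if __name__ == "__main__":
    x = 0.0
    # Illustrative pattern (assumed): the source injects 3 packets in every interval.
    for _ in range(10000):
        x = update_rate(x, num_gen=3)
    print(x)
```

With weight 0.001 per 5 msec step, the estimate averages over roughly a thousand intervals, so it smooths out the on/off pattern of the injection rule rather than tracking each decision.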

On our MadWifi+Click platform, we implemented the two-scale BP 
with rate control algorithm for intermittently connected networks. We em- 
phasize that the focus of our work is not an implementation of traditional 
back-pressure routing with the modified driver; rather, our aim is to decouple 
the two time scales (mobile-gateway and internal-internal) in such networks. 
In order to do this, we first conducted an experiment with the traditional BP 
on a single time-scale line network. The single time-scale line network consists 
of three nodes 1.101, 1.102, and 1.103 arranged in a line. We have two flows: 
the long flow originates from 1.101 and is destined for 1.103. The short flow 
originates from 1.102 and is destined for 1.103. The utility functions for the long 
and short flows are $K_1 \log(x_1)$ and $K_2 \log(x_2)$, respectively. The rate allocations 
for various values of $K_1$ and $K_2$ are shown in Figures 3.8(a), 3.8(b), and 3.8(c). 
We set L = {25, 13, 5}. 

Figure 3.8: Rate allocations in the 1.101-1.102-1.103 line network. We can control 
which flow gets more rate by controlling the utility function parameters $K_1$ 
and $K_2$. As $K_1$ increases relative to $K_2$, the long flow gets more and more 
rate. 

The optimal rate allocation is given by the solution to the following 
optimization problem. Let $f_1$ and $f_2$ be the fractions of time that the links 
1.101-1.102 and 1.102-1.103 are active, respectively. Due to the coupled wireless 
interference constraint, $f_1 + f_2 \le 1$. The optimal rate allocation can then 
be obtained by solving 

$$
\begin{aligned}
\text{maximize} \quad & K_1 \log(x_1) + K_2 \log(x_2) \\
\text{subject to} \quad & x_1 \le f_1 C, \\
& x_1 + x_2 \le f_2 C,
\end{aligned}
\tag{3.24}
$$



assuming that both links 1.101-1.102 and 1.102-1.103 have the same link capacity 
C. (C is the one-hop transmission rate between two nodes with no other 
transmissions in range and was measured to be around 465KBps.) When 
$K_1 = K_2 = 200$, the short and long flow rates we experimentally obtained 
were 200KBps and 110KBps, respectively. When $K_1 = 800$, $K_2 = 200$, the 
short and long rates obtained were 90KBps and 180KBps, respectively. When 
$K_1 = 400$, $K_2 = 200$, the short and long rates were 145KBps and 150KBps, 
respectively. The experimentally obtained rate allocations are close 
to the rate allocations obtained by solving the optimization problem (3.24). 
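
For reference, problem (3.24) has a closed-form solution. Adding the two link constraints and using $f_1 + f_2 \le 1$ gives the single constraint $2x_1 + x_2 \le C$, and the stationarity conditions for the logarithmic utilities then yield $x_1 = K_1 C / (2(K_1 + K_2))$ and $x_2 = K_2 C / (K_1 + K_2)$. The sketch below evaluates these expressions; the function name is illustrative, and the derivation is a standard Lagrangian argument rather than something spelled out in the text.

```python
def optimal_rates(k1, k2, capacity):
    """Maximize k1*log(x1) + k2*log(x2) subject to 2*x1 + x2 <= capacity.

    x1 is the long (two-hop) flow and x2 is the short (one-hop) flow.
    """
    x1 = k1 * capacity / (2.0 * (k1 + k2))
    x2 = k2 * capacity / (k1 + k2)
    return x1, x2


C_KBPS = 465.0  # measured one-hop capacity quoted in the text
for k1, k2 in [(200, 200), (400, 200), (800, 200)]:
    long_rate, short_rate = optimal_rates(k1, k2, C_KBPS)
    print(f"K1={k1}, K2={k2}: long={long_rate} KBps, short={short_rate} KBps")
```

With C = 465KBps, the formulas give long/short rates of 116.25/232.5KBps for $K_1 = K_2 = 200$, 155/155KBps for $K_1 = 400$, and 186/93KBps for $K_1 = 800$, which can be compared with the measured values quoted above.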



3.7.3 Verification on a Line Network 
3.7.3.1 Utility Maximizing Performance 

We experimentally verify our algorithm on a simple intermittently connected 
line network shown in Figure 3.1 with $N_c = 2$. We used a 100Mbps 
Cisco switch to emulate the mobile-gateway contacts. Each gateway node is 
equipped with one wireless card and an Ethernet port. The gateways use the 
wireless card to communicate with the internal nodes and the Ethernet port 
to communicate with the "mobile." 

Figure 3.9: Rate allocation in the network shown in Figure 3.1 (compare 
against optimal rate allocations in Figures 3.8(a), 3.8(b), and 3.8(c)). The 
presence of the intermittent link is hidden from both the inter- and intra-cluster 
sources since they achieve the same rates in the network shown in Figure 3.1 
as in the two-hop 1.101-1.102-1.103 line network. 

On each contact, up to 6000 packets can be transferred. Each packet 
has a payload of 1KB, in addition to the IP, Ethernet, and modified BP 
headers (roughly 6MB per contact). The mobile contact node switched clusters 
every 10 seconds, so two consecutive contacts at a gateway are 20 seconds 
apart. Thus, the average rate (averaged over many contacts) from the 
source cluster (the left cluster) to the destination cluster (the right cluster) is 
300KBps. We also chose T = 6000 and L = {25, 13, 5}. 
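
The 300KBps figure follows directly from the contact parameters. The short sketch below reproduces the arithmetic; it counts payload only and ignores header overhead, which is an assumption consistent with the "roughly 6MB per contact" estimate, and the names are illustrative.

```python
PACKETS_PER_CONTACT = 6000
PAYLOAD_KB = 1           # 1KB payload per packet; headers are ignored here
CONTACT_PERIOD_S = 20    # time between consecutive contacts at one gateway


def average_contact_rate_kbps(packets, payload_kb, period_s):
    """Long-run average rate (KBps) carried by one mobile across clusters."""
    return packets * payload_kb / period_s


rate = average_contact_rate_kbps(PACKETS_PER_CONTACT, PAYLOAD_KB, CONTACT_PERIOD_S)
```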

The purpose of the modified BP algorithm is to have the inter-cluster 
traffic source be totally unaware of the mobile-gateway contacts, and to disturb 
any intra-cluster traffic as little as possible. We also want to obtain utility 
maximizing rate allocation, even though the clusters are physically separated. 

Let $x_1$ denote the inter-cluster rate, and $x_2$ denote the intra-cluster rate. 
The utility functions for inter- and intra-cluster traffic are $U_1(x_1) = K_1 \log(x_1)$ 
and $U_2(x_2) = K_2 \log(x_2)$, respectively. We made sure the only bottleneck is 
the destination cluster. (If the bottleneck were either the source cluster or the 
mobile-gateway contacts, this could easily be learned by the source.) Thus, 
the optimal rate allocation is the solution to the maximization problem (3.24) 
(with $f_1$ and $f_2$ denoting the fractions of time the wireless links 2.103-2.101 
and 2.101-2.100 are active, respectively). In summary: 

• Recall that the traditional BP rate controller fails to give an optimal 
rate allocation (unless a large scaling is done, resulting in large queue 
sizes), and resulted in low inter-cluster rates as seen in Figure 3.2(a). 
However, using the modified BP, we get a high, sustained throughput 
for both the inter- and intra-cluster flows, and their rates (shown in 
Figures 3.9(a), 3.9(b), and 3.9(c)) are close to the theoretically computed 
values. The rates are also close to the ones obtained in the "Routing and 
Rate Control" portion of Section 3.7.2 (shown in Figures 3.8(a), 3.8(b), 
and 3.8(c)). Thus, the modified BP successfully hides the presence of 
the intermittent links. 

• We also verify that large queues occur only at gateways; the queue sizes 
at gateways were approximately $5 \times 10^4$. The queues at internal nodes were small, 
roughly 10. 

Figure 3.10: We can decrease the end-to-end inter-cluster delay first by preventing 
packet looping (dashed) and second by using shadow packets. As we 
use more and more shadow packets per data packet, the delay decreases faster. 

3.7.3.2 End-to-End Delay: Shadow Packets 

We compare the end-to-end delay that inter- and intra-cluster packets 
experience. Since inter-cluster packets must pass through the gateways and 
mobiles with large queues, they incur large delays. One factor causing the 
large inter-cluster packet delay is that some packets are "looping" between the 
gateways and the mobile. ("Looping" through large FIFO queues a few times 
can increase delays significantly.) When we prevent this looping, we get the 
inter-cluster delay shown in Figure 3.10. (This looping is prevented only at gateways 
and mobiles. Looping is prevented by never having the mobile transmit a 
packet back to the gateway from which it received that packet.) 
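
The loop-prevention rule at the mobile can be sketched as a FIFO queue that remembers which gateway each packet came from. The class and method names below are hypothetical, and the choice to skip over ineligible packets (rather than block on the head of the queue) is an assumption; the text only states that a packet is never sent back to its originating gateway.

```python
from collections import deque


class MobileQueue:
    """FIFO queue at a mobile that records the gateway each packet came from."""

    def __init__(self):
        self._packets = deque()

    def receive(self, packet, from_gateway):
        self._packets.append((packet, from_gateway))

    def next_for(self, gateway):
        """Pop the oldest packet that was not received from `gateway`."""
        for index, (packet, origin) in enumerate(self._packets):
            if origin != gateway:
                del self._packets[index]
                return packet
        return None  # every queued packet came from this gateway

    def __len__(self):
        return len(self._packets)
```

For example, a packet received from gateway g1 stays queued while the mobile is in contact with g1 and is only handed over once the mobile meets a different gateway.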

Another factor that contributes to the large inter-cluster delays is that 
our utility-maximizing rate controller operates very close to the boundary of 
the capacity region. This is known to require large queues and can thus cause 
long delays. The authors in [14], [31] deal with this problem by introducing 
the notion of shadow packets and queues. Their essential idea is to trade 
off throughput for low delays. Our implementation of shadow packets is as 
follows: 

Shadow packet implementation: For every k = 3 packets that the inter-cluster 
source injects into its queue according to Eq. (3.23), it marks one red (or 
shadow). The other two packets are marked blue (data). These shadow packets 
are dummy packets and do not contain any useful data (but still have a 1KB 
payload). The blue packets contain real data. Thus, the real, useful rate is 
$0.66x$. The gateways and mobiles have two FIFO sub-queues for each inter-cluster 
destination. The red and blue packets are separated into these two 
sub-queues. The blue packets get transmission priority over the red packets, 
i.e., shadow packets are transmitted if and only if there are no blue packets 
that can be served. The total size (blue queue size + red queue size) is used 
for back-pressure routing. 
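
The two-sub-queue discipline described above can be sketched as follows. This is a minimal illustration under stated assumptions: the class and function names are hypothetical, and which packet of each batch of three is marked as shadow is not specified in the text (the sketch marks the first).

```python
from collections import deque


class ShadowAwareQueue:
    """Per-destination queue with a data (blue) and a shadow (red) FIFO."""

    def __init__(self):
        self.blue = deque()  # real data packets
        self.red = deque()   # dummy shadow packets

    def enqueue(self, packet, is_shadow):
        (self.red if is_shadow else self.blue).append(packet)

    def dequeue(self):
        """Serve blue packets first; serve a red packet only if no blue is waiting."""
        if self.blue:
            return self.blue.popleft()
        if self.red:
            return self.red.popleft()
        return None

    def backpressure_length(self):
        """Back-pressure routing uses the combined blue + red queue size."""
        return len(self.blue) + len(self.red)


def mark_batch(packets):
    """Mark one packet per batch as shadow (assumed here to be the first)."""
    return [(packet, index == 0) for index, packet in enumerate(packets)]
```

Because the shadow packets still count toward the back-pressure queue length, they sustain the queue differentials that routing depends on, while data packets overtake them and see shorter waits.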

The end-to-end delay with shadow packets is shown in Figure 3.10. The 
delay is measured only for the data packets. (We also did another experiment 
where we send 1 shadow packet for every 5 data packets on average (the green 
dash-dot curve in Figure 3.10). The two shadow packet experiments were 
run for 60 minutes. When the rate of shadow packet generation is 0.2 per 
data packet, the experiment was not long enough for the delay to converge to 
the minimum, but from the figure, it is clear that the algorithm with shadow 
packets has a significantly smaller delay than that without shadow 
packets.) The delay curve first increases because we first need to build large queues 
at the gateways and mobiles. But as the real packets have priority, only the 
dummy, shadow packets are left behind to hold the steady-state queue sizes 
required for back-pressure to work. The inter-cluster delay decreased from 
roughly 15 minutes to roughly 1-2 minutes using shadow packets. 

3.8 Experimental Results from Testbed 

We implemented the modified back-pressure algorithm on our 16-node 
testbed. The network we conducted our experiment on is shown in Figure 3.13. 
The nodes in each cluster were placed only a few feet apart. (The clusters were 
on different channels.) Each node in a cluster uses packet filtering based on the 
source IP address; so for example, 1.101 can accept packets only from 1.102 
and is only aware of 1.102's presence. Thus, 1.101 will only transmit to 1.102. 
We are aware that any transmission from, say, 1.101 causes interference on 
all other transmissions because the wireless range is large enough to cover the 
entire cluster. However, a node can receive only one transmission at a time, and 
a failed transmission from, say, 1.103 (which will also transmit to 1.102 only) to 
1.102 due to the interference caused by a simultaneous transmission from 1.101 
would not have been received by 1.102 anyway even if the nodes were placed 
farther apart. (We are aware that the way we have closely laid out the nodes 
to conduct our experiment does not completely model the network depicted in 
Figure 3.13. For example, as depicted in the figure, the nodes 1.103 and 1.101 
are supposedly placed far apart, but they are within each other's transmission 
range. Thus, there can be a collision between transmissions from 1.101 and 
1.103 in the depicted network, but not in our physical network (because of 
CSMA). However, in practice the carrier sensing range of 802.11 is larger than 
the transmission range. Hence, the network we are actually modeling is one where 
the nodes 1.101 and 1.103 are out of each other's transmission range, but still 
within the carrier sense range.) Thus, placing the nodes close together does not make 
our experimental results less valid. 

Using a 100Mbps switch to emulate the mobile-gateway contacts, up 
to 6000 packets can be transmitted between a mobile and a gateway (so up 
to 12000 packets total) per contact. We picked T = 6000, $k = \eta = 3$, and 
used queue difference levels L = {25, 13, 5}. After the contact is finished, the 
mobile picks one of the other two gateways randomly, and initiates another 
contact 14 seconds later. We are aware that 14 seconds is not long enough to 
model most mobility in the real world. However, we picked 14 seconds to speed 
up our experiments so that we can have many contacts within a reasonable 
amount of time. Finally, all flows have the same utility function of $200\log(\cdot)$ 
in this study. 

In summary, we have the following results from our experiment on the 
network in Figure 3.13: 

• We observed that even in this larger network, the intra-cluster queues 
remain very small (between 10 and 15 pkts). Only the inter-cluster flows 
suffer large delays due to longer queues (between $10^4$ and $3 \times 10^4$ pkts). 

• Furthermore, using our implementation of the shadow queues and packets 
(the idea was proposed by the authors in [14]; we have developed an 
implementation for intermittently connected networks), we can "back 
off" from operating on the boundary of the throughput region (i.e., utility 
optimal), and improve the delay performance for the inter-cluster flows; 
the inter-cluster delay decreased from ≈ 20 mins (blue, solid trace in 
Figure 3.12) to ≈ 2 mins using our shadow packets (red, dashed traces 
in Figure 3.12). 

• To get a baseline on the approximate values we should expect, we used 
the fluid deterministic optimization (that ignores ACKs, collisions, and 
retransmissions) to obtain the "optimal" rate allocations to be $x_1 = 106$, 
$x_2 = 83$, $x_3 = 191$, $x_4 = 146$ and $x_5 = 87$ (all KBps). Our experimental 
numbers are $x_1 = 84$, $x_2 = 68$, $x_3 = 120$, $x_4 = 107$ and $x_5 = 86$ (all 
KBps). The rates differ from the fluid approximation anywhere from a 
few percent to about 40%. One source of the discrepancy is that the 
theoretical framework as in Eq. (3.24) assumes a fluid model with no 
contention loss or exponential back-offs, and does not model the 
hop-by-hop ACKs we used. In practice, there are many packet collisions that 
trigger the exponential back-offs that decrease the "link capacity" in the 
(idealized) fluid model. 

Figure 3.11: Rate allocation in the 16-node network. See Figure 3.13 for the 
labels $x_1$, $x_2$, $x_3$, $x_4$ and $x_5$. $x_2$ and $x_5$ are inter-cluster flows. 

Figure 3.12: Inter-cluster delays in the network in Figure 3.13 (red, solid = using 
1 shadow packet for every 2 data packets) 
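
The gap between the fluid-model rates and the measured rates reported in this section can be checked with a few lines of arithmetic. The sketch below uses only the ten numbers quoted in the text; the dictionary and function names are illustrative.

```python
# Fluid-model and measured rates (KBps) for the five flows, as quoted in the text.
fluid = {"x1": 106, "x2": 83, "x3": 191, "x4": 146, "x5": 87}
measured = {"x1": 84, "x2": 68, "x3": 120, "x4": 107, "x5": 86}


def relative_gap(fluid_rate, measured_rate):
    """Shortfall of the measured rate as a fraction of the fluid-model rate."""
    return (fluid_rate - measured_rate) / fluid_rate


gaps = {name: relative_gap(fluid[name], measured[name]) for name in fluid}
for name, gap in gaps.items():
    print(f"{name}: {100 * gap:.0f}% below the fluid-model rate")
```

The smallest shortfall is for $x_5$ (about 1%) and the largest is for $x_3$ (about 37%), consistent with the stated range of a few percent to about 40%.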
3.9 Appendix 

3.9.1 More Comments on Overlay Network 

It has been shown in [50] that by using the packet transfer algorithms 
(3.2), (3.3) and (3.4), the rate K b a [r\ (pkts/time slot) at which node a transfers 
packets out of u b * to q b a converges in each super time slot r. (For eqns. (3.2) 
and (3.4), b* is b; for eqn. (3.3), b* is Z a ,b[ r ]-) We assume that each super time 
slot r is long enough for this convergence to take place. (Hence, the rate at 
which a transmits to b converges to fi^[r].) In addition, within each cluster C 

£ **[t]k*[t]+ Yl w^m+EOKm 

is maximized subject to the constraint 

x intra(C) 

is supportable in C. 

We will use k^Jt] (either or R/T) to denote the rate (pkts/time slot) 
at which mobile m transmits to gateway t in super time slot r; likewise for 

Let K c ^ a b j [r] denote the rate at which packets from u c a are transmitted 
from a to b in super time slot r. For example, g ^ [r] is the rate the gateway 
gi transfers to g 2 the packets destined for g d out of queue u 9 *. 
3.9.2 Notations and definitions needed for the proof of Theorem 2 
3.9.2.1 $\Delta u_n^j[\tau] = u_n^j[\tau+1] - u_n^j[\tau]$ 

We group the inputs and outputs and the exogenous arrivals as follows 
to simplify our analysis of type II queues. For each u l n at node n, we have: 

f E Mg j E 9dGJ f e(d) < d f if n e J e(n) , I e ^e(n) 

4n(n) = \ E[ s ,Z] G jE 9sG M e(s) <1 if n e ^e(0» 1 G J e(/) 

I E ,.,r. , if n e ^e(n), i e !K e(0 , and C(n) ^ C(Z) 



K Z = J Ei6M e(n) UM 4,") if n G ^(n) U M 



v in(n) 



else 



(«4 if n e Je(n),/ G 5C e(n ) 

< if n eftefl,* eJe(i) 

E te JC e(n) U M «(n,i) if n G M e(n) U ^Vt 

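The conservation rule for type II queues is thus $x^j_{\mathrm{in}(n)} + \kappa^j_{\mathrm{in}(n)} = \kappa^j_{\mathrm{out}(n)}$. 
For regulated type II queues, the queue dynamics are 

$$u_n^j[\tau+1] - u_n^j[\tau] = T\left(y_n^j[\tau] + \kappa^j_{\mathrm{in}(n)}[\tau] - \kappa^j_{\mathrm{out}(n)}[\tau] + z_n^j[\tau]\right);$$

for unregulated type II queues, 

$$u_n^j[\tau+1] - u_n^j[\tau] = T\left(x^j_{\mathrm{in}(n)}[\tau] + \kappa^j_{\mathrm{in}(n)}[\tau] - \kappa^j_{\mathrm{out}(n)}[\tau] + z_n^j[\tau]\right).$$

$z_n^j[\tau]$ is the unused rate (pkts/time slot) in super time slot $\tau$ when there are 
not enough packets in the type II queue; 

$$z_n^j[\tau] = \max\left\{\kappa^j_{\mathrm{out}(n)}[\tau] - x^j_{\mathrm{in}(n)}[\tau] - \kappa^j_{\mathrm{in}(n)}[\tau] - u_n^j[\tau],\ 0\right\}$$

if $u_n^j$ is unregulated. 

$$z_n^j[\tau] = \max\left\{\kappa^j_{\mathrm{out}(n)}[\tau] - y_n^j[\tau] - \kappa^j_{\mathrm{in}(n)}[\tau] - u_n^j[\tau],\ 0\right\}$$

if $u_n^j$ is regulated. 

Regulated or not, we have 

$$u_n^j[\tau+1] - u_n^j[\tau] \le T\left((1+\delta)\,x^j_{\mathrm{in}(n)}[\tau] + \kappa^j_{\mathrm{in}(n)}[\tau] - \kappa^j_{\mathrm{out}(n)}[\tau] + z_n^j[\tau]\right). \tag{3.25}$$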

3.9.2.2 $\Gamma$ 

We define 

TT m (j)(i)R 
^(rn(j),g(i,j)) ~ ) 

7T m{j) (i)R 

/%(M')> m 0')) ~ j, i 

and 

#m = {/'(moO.fffyjJ'^toW)."*^))}^- 

Note that /2m is in the unit of packets/time slot. We assume that the trans- 
missions between mobiles and gateways do not cause interference to other 
transmissions. 

Define the network graph S = (N, L) where N = (|J e NqJ |J M and 



Let 
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Let A*fcf n2 ) be the rate at which the flow [s, d] is served over the link 
(^1,^2) G £■ We say x is supportable if there exists jl G V such that: 

1. For any node ri\ G INT and for all flows [s, d], 



U 1 [s,d] _ M 

f J-{ s =ni}+ 2^ ^(n 2 ,m) _ ^(ni, 
(na,ni)&C (rai,n 3 )e£ 



2 - {Em^bj)} g r - 



We say x intra (e i ) is supportable in if the conditions 1) and 2) above hold 
with L and V replaced with Lq. and Tg., respectively. 

3.9.2.3 $\kappa_{\min}$, $\kappa_{\max}$, and $N_u$ 
Let 



^mm = min < min < max n a > , R/T 

e l>,6)ee = x iaterCe) (J4er e 



and 



K mas = max < max < max n a > , R/T 

{ e {(a,b)ee : x intcr{e) u«£er e J 

We assume that $\eta > \kappa_{\max}$. We will see in Lemma 4 that there is at least one 
type II queue in each cluster that is guaranteed to transmit or receive at rate 
$\kappa_{\min}$ or higher. $\kappa_{\max}$ is the maximum rate at which any type II queue in the 
network could change. Lastly, let 

N u = N s ySf g + max{max{# of inter-cluster traffic srcs 
in 6, of inter-cluster traffic dest. in C}}) + N g 
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and let k = {^}- Note that N u > # of type II queues maintained at any 
node, and N u > # of nodes with at least one type II queue. 

3.9.3 Proof of Theorem 2 

We will first bound the queue lengths when the routing algorithms 
(Eqs. (3.1), (3.2), (3.3), and (3.4)) are updated every $\hat{T}$ super time slots; we 
will refer to $\hat{T}$ super time slots as a super-super time slot. We assume that $\hat{T}$ 
is large enough so that for any mobile $m$ and any gateway $g$ that $m$ comes into 
contact with, $m$ makes at least $(1+\epsilon)^{-1}(\pi_m)_g \hat{T}$ contacts with $g$ over $\hat{T}$ super 
time slots. Let $\hat{\tau}$ denote the $\hat{\tau}$-th super-super time slot. We will use this bound 
to obtain the upper bound when routing algorithms are updated every super 
time slot. 

We prove stability using Lyapunov analysis. The Lyapunov function 
we choose is a quadratic function of the type II queue length. We will show 
that if any one of the type II queues reaches a certain threshold, the Lyapunov 
function will start to decrease. This will show that the Lyapunov function is 
bounded, and that the queues are bounded as well. 

We will use $\kappa_{(a,b)}[\tau]$ to denote the transmission rate from node $a$ to $b$ in 
the overlay network in super time slot $\tau$, and $\kappa^c_{(a,b)}[\tau]$ to denote the rate at which 
packets from $u_a^c$ are transmitted from $a$ to $b$ in super time slot $\tau$. 
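We define a Lyapunov function $V[\hat{\tau}\hat{T}] = \sum_n \sum_j \left(u_n^j[\hat{\tau}\hat{T}]\right)^2$. Let 

$$
\begin{aligned}
\Delta_{\hat{T}} V[\hat{\tau}\hat{T}] &= V[(\hat{\tau}+1)\hat{T}] - V[\hat{\tau}\hat{T}] \\
&= \sum_n \sum_j \left(u_n^j[(\hat{\tau}+1)\hat{T}]\right)^2 - \left(u_n^j[\hat{\tau}\hat{T}]\right)^2 \\
&= \sum_n \sum_j \left(u_n^j[(\hat{\tau}+1)\hat{T}] - u_n^j[\hat{\tau}\hat{T}]\right)\left(u_n^j[(\hat{\tau}+1)\hat{T}] + u_n^j[\hat{\tau}\hat{T}]\right) \\
&= \sum_n \sum_j \Delta u_n^j[\hat{\tau}\hat{T}]\left(u_n^j[(\hat{\tau}+1)\hat{T}] + u_n^j[\hat{\tau}\hat{T}]\right)
\end{aligned}
$$

where 

$$
\begin{aligned}
\Delta u_n^j[\hat{\tau}\hat{T}] &= u_n^j[(\hat{\tau}+1)\hat{T}] - u_n^j[\hat{\tau}\hat{T}] \\
&= \sum_{l=\hat{\tau}\hat{T}}^{(\hat{\tau}+1)\hat{T}-1} \left(u_n^j[l+1] - u_n^j[l]\right) \\
&\le T \sum_{l=\hat{\tau}\hat{T}}^{(\hat{\tau}+1)\hat{T}-1} \left((1+\delta)\,x^j_{\mathrm{in}(n)}[l] + \kappa^j_{\mathrm{in}(n)}[l] - \kappa^j_{\mathrm{out}(n)}[l] + z_n^j[l]\right).
\end{aligned}
\tag{3.26}
$$

Eq. (3.26) follows from the following queue dynamics: 

$$u_n^j[\tau+1] - u_n^j[\tau] \le T\left((1+\delta)\,x^j_{\mathrm{in}(n)}[\tau] + \kappa^j_{\mathrm{in}(n)}[\tau] - \kappa^j_{\mathrm{out}(n)}[\tau] + z_n^j[\tau]\right) \tag{3.27}$$

which we derived in Eq. (3.25). Thus, 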

(f+i)f-i 

AfV[ff\ < E [a+^nt'i+^w 

n j l=ff 

+ M(f + m + <[ff ]) 

^ r EEE[( 1+5 )<w+< ( „)W 

n j « 

^ rEEE 2 <[^][(i+^] 



where $C = 4N_u^2\,(T\hat{T}\kappa_{\max})^2$ because if $u_n^j[\hat{\tau}\hat{T}] > T\hat{T}\kappa_{\max}$, then $z_n^j[l] = 0$ since 
there are enough packets; and $z_n^j[l] \le \kappa_{\max}$ always. ($\kappa_{\max}$ is defined in 
Subsection 3.9.2.3. $\kappa_{\max}$ is the maximum rate (pkts/time slot) at which any type II 
queue can change.) 

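Let 

$$
\begin{aligned}
A_x[\hat{\tau}\hat{T}] &= T \sum_n \sum_j \sum_{l=\hat{\tau}\hat{T}}^{(\hat{\tau}+1)\hat{T}-1} u_n^j[\hat{\tau}\hat{T}]\, x^j_{\mathrm{in}(n)}[l] \\
&= T\hat{T} \sum_n \sum_j u_n^j[\hat{\tau}\hat{T}]\, x^j_{\mathrm{in}(n)}[\hat{\tau}\hat{T}]
\end{aligned}
\tag{3.28}
$$

since (3.1) is assumed to be updated every $\hat{T}$ super time slots; let 

$$
\begin{aligned}
B_\kappa[\hat{\tau}\hat{T}] &= T \sum_n \sum_j \sum_{l=\hat{\tau}\hat{T}}^{(\hat{\tau}+1)\hat{T}-1} u_n^j[\hat{\tau}\hat{T}]\left(\kappa^j_{\mathrm{out}(n)}[l] - \kappa^j_{\mathrm{in}(n)}[l]\right) \\
&= T \sum_j \sum_{(n,m)} \sum_{l=\hat{\tau}\hat{T}}^{(\hat{\tau}+1)\hat{T}-1} \left(u_n^j[\hat{\tau}\hat{T}] - u_m^j[\hat{\tau}\hat{T}]\right)\kappa^j_{(n,m)}[l].
\end{aligned}
\tag{3.29}
$$

Note that since (3.1) is updated every $\hat{T}$ super time slots, 

$$A_x[\hat{\tau}\hat{T}] = T\hat{T} \sum_{[s,d]} \min_{g_s, g_d}\left(u_s^{g_s}[\hat{\tau}\hat{T}] + u_{g_s}^{g_d}[\hat{\tau}\hat{T}] + u_{g_d}^{d}[\hat{\tau}\hat{T}]\right) x_s^d.$$

Then, 

$$\Delta_{\hat{T}} V[\hat{\tau}\hat{T}] \le 2(1+\delta)A_x[\hat{\tau}\hat{T}] - 2B_\kappa[\hat{\tau}\hat{T}] + C. \tag{3.30}$$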

Since (1 + 5 + e)x is supportable, there exists ft G T and k such that 
/2 G T and 

= E E <\ ff \ ^ + 5 + e X(n) + ^n(n) ~ «L(n)) • 
n j 

(r is defined in subsection 3.9.2.3. It is the convex hull of all possible sched- 
ules, including all possible mobile-gateway schedules, k is a vector of possible 
transmission rates in the overlay network.) 

Letting A x [ff] = TfJ2 n J2 jU i[ff]xl {n) and 



B K [ff] = TfJ2J2<l ff ^Un)-<n ( n 



n j 



Tf^^«[fT]-<[fT]) fi ; 



J 

"(n,m) ' 
j (n,m) 



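we have $(1+\delta+\epsilon)\hat{A}_x[\hat{\tau}\hat{T}] = \hat{B}_\kappa[\hat{\tau}\hat{T}]$ because $(1+\delta+\epsilon)x$ is supportable. 
Hence, 

$$
\begin{aligned}
\Delta_{\hat{T}} V[\hat{\tau}\hat{T}] \le{} & 2(1+\delta)A_x[\hat{\tau}\hat{T}] - 2B_\kappa[\hat{\tau}\hat{T}] + C \\
& - 2(1+\delta+\epsilon)\hat{A}_x[\hat{\tau}\hat{T}] + 2\hat{B}_\kappa[\hat{\tau}\hat{T}].
\end{aligned}
$$

Since $A_x[\hat{\tau}\hat{T}] \le \hat{A}_x[\hat{\tau}\hat{T}]$ and $\hat{B}_\kappa[\hat{\tau}\hat{T}] \le B_\kappa[\hat{\tau}\hat{T}]$ (see Lemmas 3 and 4 in 
the Appendix of this chapter), 

$$\Delta_{\hat{T}} V[\hat{\tau}\hat{T}] \le -2\epsilon\hat{A}_x[\hat{\tau}\hat{T}] + C.$$

If $\hat{A}_x[\hat{\tau}\hat{T}] > C/(2\epsilon)$, then $\Delta_{\hat{T}} V[\hat{\tau}\hat{T}] < -C$. 

If $\hat{A}_x[\hat{\tau}\hat{T}] \le C/(2\epsilon)$, then from Eq. (3.30) and the fact that $A_x[\hat{\tau}\hat{T}] \le \hat{A}_x[\hat{\tau}\hat{T}]$ we have 

$$\Delta_{\hat{T}} V[\hat{\tau}\hat{T}] \le C' - 2B_\kappa[\hat{\tau}\hat{T}]$$

where $C' = C + \frac{1+\delta}{\epsilon}C$. If there exists $(m, n)$ and $j$ such that 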

<[ff ] - <[ff ] > -S— (3.31) 

1 J rC m i n 

then AV f [ff] < -C (plug Eq. (3.31) into Eq. (3.29)). («; min is defined in 
subsection 3.9.2.3. It is the minimum rate at which at least one type II queue 
is guaranteed to change.) 

Such (m,n) and j that satisfy Eq. (3.31) exist if there exists m such 
that u J m [fT] > J^ uC since u™[fT] = and there are N u nodes with type II 
queues (N u is defined in subsection 3.9.2.3). 

Thus, V[ff] < (K) 2 where 

K = N U (J^— + Km&x TT 

\ "^min 

which implies that 

<[ff ] < K (3.32) 
for all n and j when our algorithm is updated every T super time slots. 

We now consider the real queue u J n [fT] where n and j are gateways. 
Let M^[fT] be the number of packets in the network that are yet to arrive at 
the regulated queue u J n yet. Assume there is f such that for f > f , we have 

Mi[ff}>N u K + Tf Kmax 

which implies m^[tT] > TTft max due to Eq. (3.32). The packets that did not 
arrive at u J n must be held some where, and because there are fewer than N u 
nodes with type II queues and each regulated type II queue is bounded as in 
Eq. (3.32), we must have u J n [fT] > TTK max . Thus, we get 

Mi[(f+l)f] = frl Yl 4i[rT]\+Mi[ff]-fTyi[fT] 

\[s,d]65intor / 

< Miiff] 

which implies that for f > fo 

Mi\rT\ < max [Mi[f f ], N U K + 2TTn max j . 

Thus, real queues u^fT] are bounded as well. 

We now examine the real queue when n is a gateway and j is an inter- 
cluster traffic destination. Let 23 denote the bound on u J n [tT] where n and j 
are gateways. Let M^[fT] be the number of packets in the network that are 
yet to arrive at the regulated queue u J n , n G Uq'Kq, j G Je(n)- Assume there is 
fo such that for f > f , we have 

Mi\rf\ > N U K + Ng'B + TT/€ max 
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which implies w{[tT] > TTi% max due to Eq. (3.32) and since real, gateway- 
to-gateway queues are bounded by H> and there are only N g source gateways 
where these packets can be. Thus, we get 

Ml[(f+l)f] = fT^x:;^fTU+Mi[ff}-fTyi[fT} 
< Mi[ff] 

which implies that for f > f 

Ml[ff} < max [Mftfof ], N U K + Ng'B + 2TT/w} . 

Thus, real queues u^fT] are bounded as well. 

Now, consider when our algorithm is updated every super time slot (as 
is in Eq. (3.1), (3.2), (3.3) and (3.4)). We have 

(f+i)f-i 

A f v[ff] = EE E K[' + i]) 2 -Kfl) 2 

n j i=ff 

= E E E W + !] - KP + !] + <W 

n j i 
By Eq. (3.25), 

A f V[ff] < £ [(1 + W1+<(„)M 



(f+l)f-l 



« 3 l=fT 



n j Z 

-«L(n)M+4fl] (2<[/]+T Kmax ) 
n j I 



C" 



< 2(l + S)A x [f}-B K [f}+C 



where C" = 4TiV2 (T Kmax ) 2 and 



and 



(f+l)T 

« i «=fT 
(f+i)f 



4[ff] =T^^^ <[/] (< ui(n) [/] - < (n) [/]) . 
" i j=ff 

By Lemmas 5 and 6, 3 Ca, Cb such that A x [fT] < A x [fT] +Ca and B K [fT] > 
B K [fT] — Cb] thus, we have 

A f V[ff] < 2(1 + 6)A 3C [fT] - B K [ff] + C" 

where C" = C + + Cg. Then, following the same analysis as before, we 
get 

(N C" 
+ KmaxTT 
-L J- ^min 
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using algorithms (3.1), (3.2), (3.3) and (3.4), where C" = C" + (1 + S)C"/e. 

Following similar reasoning in [82], we can bound the real queues. 

Note that the probability that the mobile does not exhibit stationary 
distribution in T super time slots is exponentially decreasing in T. Thus, we 
can obtain an expected bound on the type-II queues. Furthermore, we can 
use theorem 1 of [50] to bound the type-I queues since in eqs. (3.2), (3.3) and 
(3.4), type-II queue sizes are used as a linear utility function. ■ 

3.9.4 i x [rT] > A x [ff) 
Lemma 3. A x [ff] > A x [ff] . 

Proof: 

^ Tf E Q ^ {< ^\ + uf s [ff\ + < [ff\) x d s 

= A x [ff] 

since V ^ 5~\ ^ x 9 J' d = x d , and because of algorithm (3.1) and the 
fact that the source routing is updated every T super time slots. ■ 



3.9.5 B K [ff] < B K [ff] 
Lemma 4. B K [ff] < B K [ff}. 
Proof: In each cluster C, there are intra-cluster traffic flows mixed with 
source-to-gateway relay, gateway-to-gateway relay, and gateway-to-destination relay traffic, all 
of which form parts of inter-cluster traffic. 

We will use the theorems from [50] to prove our lemma. 

In the super-super time block f, let the utility of each intra-cluster 
traffic [s,d] in C be U[ Si( ^(x[ St( ^) = ax[ s ^, where a is some constant. 

For each source-to-gateway relay that originates from s G 6 ending at 
gateway g s G 6, let the utility function be 

For each gateway-to-gateway relay between gi and g^, let 

For each gateway-to-destination between g& and d, let 

We assume that a » 9f'[fT\,9%[f^,0%JrT\. 

It is shown in Theorem 1 of [50] that our transfer algorithms (3.2), (3.3) 
and (3.4) solve the optimization problem 

Maximize: ^ U M ( x ) + E ^ [s,g d ) ( K f ) 
Subject to: {x[ s ,d\ ■ ^intrace)} |J{«a} e T e 

{%[s,d]} < Xintra(e) 
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Since a » 6 9 s 3 [tT] , 6 9 * [rT] , 8g [tT] , we assume that the above optimization 
is solved with {£[ s ,d]} = x intra (e). Then, the above optimization becomes 

Maximize: E^l^f ) + E + E f/ ^«) 

Subject to: x intra(e) [J{«*} E T e , 

which is solved by transfer algorithms (3.2), (3.3) and (3.4) according to [50]. 

Let {kJ[tT]} be the set of values that maximize the above optimization 
problem. Then, 



TT E E («M - <[ fT < m) [rT] > 

j (n,m):n,m^M 

TT E E (<[ f f ] " <[r?]) n^ m) (3.33) 

for each cluster C. 

Let l{m k =gi k }[l] = 1 if m(k) and g(i, k) are in contact in super time slot 
/. We now consider the mobile-to-gateway contacts. 
f-i 

E E Eww^ +l ^ R 

1=0 m k £M g^ k 



+ (vs mfc)[ " 1 [^]-^r mfc>[ " 1 [^])] ( 3 - 34 ) 
since in every super time slot in which $m_k$ and $g_{i,k}$ are in contact, R packets are uploaded 
and downloaded, and thus 



T-l 



T T^(m k ,^ k )[rT] = ^\ {mk=g . k} [fT + l]R, 

1=0 

T-l 

T TK{g i<k ,m k )[fT] = ^2l{m k =g iik }[rf + l]R. 



1=0 

Since we assumed that any mobile m makes at least (1 + 7)~ 1 (7r fn ) 9 T 
contacts with g (g is any gateway m can come in contact with) over T super 
time slots (see proof of theorem 2), we have 

T-l 

> (i + t)- 1 ^)^^ 

1=0 

and thus 

Eq. (3.34) > ^ E(l+ 7 )- 1 ( 7 r m J^T^ 

S[Um k [TT\-Ug ith [TT\J 

+ [ U 9i,k [TT\-Um k F] (3.35) 



Since (1 + 7)(1 + 5 + e)x is supportable, we have 



J 

In addition, because we have 
u 
for any j by gateway-B-mobile back-pressure algorithm equations (3.6) and 
(3.7), we have 

RHS of Ineq. (3.35) > TT £ £ £ (<Jff] - <Jff]) 

H W )(<J^]-<J^])- (3-36) 

LHS of Ineq. (3.33) + Eq. (3.34) = B K [ff] , and RHS of Ineq. (3.33) + RHS 
of Ineq. (3.36) = 5 K [fT]. Combining ineqs. (3.33), (3.35) and (3.36), we have 

B K [ff]>B K [ff\. 

■ 

3.9.6 A x [ff] < A x [ff] + C A 
Lemma 5. A x [ff] < A x [ff] + C A . 

Proof: Let r; denote the /-th super time slot in super-super time slot f, where 
/ = 0,1,...,T — 1. For any [s, d\ G Sinter, since K max T is the maximum amount 
by which any type II queue can increase in a super time slot, we have 

min («? [ff + r l+1 ] + <[ff + r l+l ] + < d [ff + r m ]) 
< min (uf [ff + 7i] + < [ff + 7i] + < [ff + r,]) + 3/wT. 
Hence, 



r-i 



< 



mm l uj- [fT + r ] + < [f T + r ] + < [fT + r ] 

f-1 



T 



+3/=c max T ^ ^ / 



from which we get 

f-i 



E E 

Z = l [Minuter 



9s, 9d 



min «f [ff + tj] + u%\ [ff + tj] + < [ff + r,] 



xtT 



s E 

| 3(K max T) 2 (T-l)T 



min M f [fT + r ] + < [fT + r ] + < [f T + r ] 

9s,9d 



TTx< 



(3.37) 



Summing over s and <i, the LHS of Eq. (3.37) yields A x [rT] and the RHS yields 
A x [ff]+CUwithCU = 1.5(N u K max T) 2 (f-l)f; hence, i x [ff] < A x [ff}+C A . 



3.9.7 4[ff] > B K [ff ] - C B 
Lemma 6. B K [ff] > B K [ff] - C B . 

Proof: Let t\ denote the Z-th super time slot in super-super time slot f , where 
I = 0, 1, . . . , T — 1. For any (m, n) G £, let 



PLn)lfT + 71] = <JfT + n] - <[ff + 71]. 
Since $\kappa_{\max} T$ is the maximum amount by which any type II queue can 
increase/decrease in a super time slot, we have 

P(m,n) l ff + ^ ^ P Ln) + n] ~ ^ max T (3.38) 

for any (m,n) G L. From Eq. (3.38), we get 

PUn) l ff + ^ P kn) \ ff \ - 2/T «~ (3.39) 



Let 



a = argmax ^ m f x [ P (m,n)\T^\) A*(m,n) 



(m,n) 



and 



(m,n) • 



£[tj] = argmax V" max [Pf mn) [ff + r { ] ) H( r 
tier 3 ^ ' 

(m,n) 

FromEq. (3.39), 

Tf max (P(L >n) [fT]) (a) (m>n) - A^TTfNl 

(m,n) 

f-1 

< T EE m f ( P (U + ^)) ( 3 - 4 °) 

2=0 (m,n) 

< T E E max ( P (U)[^ + n}) (bin}) . (3.41) 

=0 (m,n) 



S„[fT] = TT £ max (^[ff]) (a)^ - 4(/, max TT) 2 iV 1 

(m,n) 
since algorithms (3.2), (3.3), (3.4), (3.6), and (3.7) are updated every T super 
time slots. 

f-i 

J V v ' 1 / V / (m,n) 



4[f f] = T £ £ max (Pj n)n) [rf + r,]) (fo] 



Z=0 (m,n) 

since those same algorithms are updated every super time slot. Thus, from 
Ineq. (3.40) and (3.41), our lemma holds with C B = 4(K max TT) 2 Nl ■ 
Chapter 4 



Efficient Data Transport with Mobile Carriers 

4.1 Introduction 

Data delivery across "disconnected clusters" of nodes using mobile 
nodes is of increasing interest. Applications include those in Disruption Tolerant 
Networking [24], battlefield networks, and more generally scenarios 
where there is a lack of infrastructure. Mobile nodes potentially serve multiple 
functions, e.g., surveillance and monitoring of the region, along with supporting 
data delivery. Further, in many applications, there is likely to be some 
flexibility in choosing the trajectories of these mobile nodes (i.e., controllable 
mobility). 

For concreteness, consider an exploration outpost in a remote corner of 
the world. At such a location, it would be difficult to establish infrastructure 
for traditional cellular or WiFi networks due to cost, availability of power 
sources, etc. Relying on satellites can be expensive and would only support low 
data rates. At such remote locations, one can utilize a group of reconnaissance 
mobiles (such as UAVs) to transport data from one part of the network to 
another. These UAVs can be used to patrol the premises periodically in order 
to ensure security, and they can be readily equipped with radio transceivers 
to pick up and drop off data at different locations as they patrol, thus serving 
a dual purpose. 

Because these UAVs play a critical role in providing connectivity, there 
has been a surge of interest in developing reliable and efficient algorithms for 
these types of networks that use mobile data carriers. However, due to the 
opportunistic and intermittent nature of the mobile connections (the wireless 
connections are formed and broken as the mobiles move about) and high link 
delays, the traditional routing and rate control algorithms, such as OSPF and 
TCP, used in the Internet suffer performance degradation if used in highly 
intermittent and opportunistic environments. 

One way to mitigate the problem of opportunistic and random con- 
nectivity is through controlled mobility. By controlling the motion of mobile 
data carriers, one can make the connections less opportunistic and random, 
and more periodic or more predictable. There is a vast amount of literature 
available on controlled mobility, ranging from robotics to operations research 
[12,62]. Extensive study has been done on problems such as minimizing the
travel cost subject to some constraints and finding optimal routes for the
pick up and drop off of goods [58].

In this dissertation, we focus on minimum cost dynamic routing with 
controlled mobility. Specifically, we study a network of stationary nodes that 
rely on mobiles to transport data between them, and the data rates are not 
known and may vary over time. The cost for data transport consists of two 
parts:

First, there is a per-packet per-route cost, which reflects the cost of
transmitting a packet over a specific route. For instance, such linear costs
have been used in the literature [81] to minimize hop count. This cost could be
source dependent (e.g., a hard-to-reach source might be penalized with a higher
cost). In our study, we allow any source and mobile dependent per-packet cost
to reflect this.

Second, there is a per-route cost - a mobile is allowed to periodically 
change trajectories, and the cost is a function of the trajectory that is cho- 
sen. For instance, longer trajectories (that potentially use more fuel) could 
be penalized with a higher cost (in our model, we allow any positive cost per 
trajectory). 

In this dissertation, we design an algorithm that will 1) guarantee 
throughput optimality and 2) minimize the sum cost over the entire network. 
We do this by enabling the mobiles to control their own routes of operation 
in response to the traffic demand. Without controlled mobility, one would 
have to resort to fixing the routes a priori, and this could lead to an unstable 
network, as we demonstrate in the next section. 

4.2 Illustrative Example 

Consider the simple network shown in Figure 4.1. We have one mobile
that can choose from two "routes." In the left route, it will come into contact
with stationary nodes S1 and S2 (in that order); in the right route, with S3
and S4. On contact, the mobile can drop off and pick up data to and from
the stationary nodes. Each route requires two minutes to finish, and after one
route is finished, the mobile returns to the center and can choose the next
route. On each contact, 200 packets can be picked up or dropped off by the
mobile. In addition to transporting data between nodes, the mobile also
patrols the area, and must travel each route at least 15% of the time.

Figure 4.1: A simple network with two modes: The mobile can choose to orbit
the left route or the right route. If it chooses the left (right) route, the mobile
will come into contact with stationaries S1 and S2 (S3 and S4). On contact,
the mobile can drop off and pick up data to and from the stationaries. In this
figure, S1 generates a stream of data for S2 (S3 for S4). To serve the flow
S1-S2, the mobile has to go into the left route, pick up data from S1 and drop
them off to S2. If the flow S1-S2 has a higher rate than S3-S4, the mobile should
go into the left route more often than the right route.

Figure 4.2: (Plot; horizontal axis: Time (mins), 50 to 300.) If the route
operation percentages are set to 50-50 a priori, the
source rates we have in Figure 4.1 cannot be supported and will lead to
unstable queues. However, as long as the source rates are in the capacity
region (which takes into account the patrol requirements), our algorithm will
support them and stabilize the queues.

S1 generates data destined to S2 at a rate of 70 pkts/min, and S3 generates
data destined to S4 at a rate of 20 pkts/min. The mobile does not know
the data rates, and the rates can vary over time. If the mobility pattern of the
mobile is fixed, then the network may not be able to support the two traffic
flows. For example, if the mobile patrols each route 50% of the time, then the
flow from S1 to S2 cannot be supported in the network and the queues are
unstable, as shown in Figure 4.2.

In this dissertation, we develop an algorithm that controls the mobility 
pattern of the mobile dynamically so that it not only satisfies the surveillance 
requirement but also stabilizes the queues whenever it is possible. In the 
simple example in Figure 4.1, using our algorithm, the mobile may patrol 
the left route 75% of the time and the right route 25% so that each route is 
patrolled at least 15% of the time (surveillance requirement), and both traffic 
flows are supported and all queues are stable. 
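
The arithmetic behind this example can be checked with a short sketch. It
assumes (these are illustration-only assumptions, not statements from the
model section) that each traversal of a route yields exactly one contact with
the source, that returning to the center takes no extra time, and that the
function names `sustained_rate` and `is_supported` are hypothetical helpers.

```python
# Back-of-the-envelope check of the two-route example.
# Assumption: one 200-packet pick-up per 2-minute traversal of a route.

def sustained_rate(time_fraction, pkts_per_contact=200, route_minutes=2.0):
    """Average pick-up rate (pkts/min) when the mobile spends
    `time_fraction` of its time on the route serving this flow."""
    return time_fraction * pkts_per_contact / route_minutes


def is_supported(arrival_rate, time_fraction):
    """A flow is sustainable only if arrivals do not exceed the service rate."""
    return arrival_rate <= sustained_rate(time_fraction)


demands = {"S1->S2": 70.0, "S3->S4": 20.0}        # pkts/min, from the text
fixed_split = {"S1->S2": 0.50, "S3->S4": 0.50}    # a priori 50-50 split
adaptive_split = {"S1->S2": 0.75, "S3->S4": 0.25}  # split quoted in the text

for name, split in (("50-50", fixed_split), ("75-25", adaptive_split)):
    ok = all(is_supported(demands[flow], split[flow]) for flow in demands)
    print(name, "supports both flows:", ok)
```

Under these assumptions a 50% share yields only 50 pkts/min on the left route,
below the 70 pkts/min demand, while the 75-25 split yields 75 and 25 pkts/min
and still respects the 15% patrol requirement on both routes.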

4.3 Related Works 

Networks that utilize mobile carriers to transport data have recently been
studied extensively in [8, 15, 16, 24, 27, 33, 43, 51, 53, 65, 74, 77, 79] and
elsewhere. The primary focus of [43,65,74,77] is to increase the data delivery
probability and reduce delivery latency through replication in the context of
delay-tolerant networks (DTN). Replication is useful in networks where mobile
carriers move randomly because it increases the opportunities to transfer data
from mobile nodes to static nodes and vice versa. In networks where the
mobility patterns of mobiles are fixed, replication is not necessary. However, the
drawback of fixed mobility patterns is that the network cannot dynamically
respond to changes in the traffic loads. In addition, these works consider it
sufficient that data is delivered to the destination; in networks where the
mobility pattern can be controlled, we can guarantee not only data delivery
but also efficient, optimal utilization of network resources.

An extensive simulation study of a network where mobile messengers
are used to transport data among clusters has been done in [27]; there,
the impact of different mobility patterns of the messengers on delay and
efficiency is examined. In [18] and [70], algorithms that control the mobility
of a mobile data collector in a sensor network to reduce data collection
delay have been developed. Both papers explore the trade-off between mobility
and wireless transmission energy. In [20], a trajectory control algorithm is
developed in which the mobile data collector dynamically switches its trajectory
to be closer to sensor nodes with more data to transmit in order to save
transmit power. In addition, [36, 61] explore sensor networks where nodes can
reconfigure their positions dynamically in order to enhance coverage and life 
span. The work in [44] is a study of a routing protocol based on controlled 
mobility that minimizes the distance traveled by mobile carriers to reach the 
destination. However, the mobile messengers in [18,20,27,36,44,70] cannot
adapt their mobility patterns to traffic loads, which results in unstable
queues, as illustrated in the previous section. Experimental evaluation of
a heuristic mobility control algorithm that can respond to changing network
capacity and demand is presented in [17]. Optimization based approaches for
mobile data collection have been studied in [63], where the authors study the
problem of mobile sinks that need to collect data from various sources before
their respective buffers overflow. The deadline-based formulation (the deadlines
being the times to fill buffers at various nodes) is shown to be NP-complete, and
various heuristics are then explored to alleviate this. Further, in [28], the authors
study the problem of transporting data to a single collector from a collection
of stationary data generators via reinforcement learning techniques.

The optimization framework based on back-pressure [73] that is used 
in our dissertation has been used extensively in [22,38,48-50,52,68,81,82] 
and many others for developing efficient resource allocation schemes in wired 
and wireless networks in the context of congestion control and back-pressure 
routing and rate control. The networks studied in these papers consist of 
static nodes, and the links are not intermittent, while in this dissertation, we 
focus on intermittently connected networks, and develop an algorithm that 
controls mobility to support network traffic flows while guaranteeing some 
other objectives such as surveillance requirement. The optimization algorithm 
that is most similar to ours is the one developed in [49]. The focus of [49] 
however is on an abstract problem of optimizing stochastic renewal systems; 
ours focuses on minimizing the cost of transporting data over mobile networks. 

In this dissertation, we combine the optimization framework used for
back-pressure routing with mobility control in order to develop a dynamic
throughput and cost optimal mobility control algorithm that allows multiple
mobiles to transport data among a collection of stationary nodes. Our
contributions include:

1. We formulate a cost minimization framework for the network where the 
mobile carrier adapts its mobility pattern to support traffic flows among 
stationary nodes while satisfying a secondary surveillance objective. We 
present the min-cost mobility control algorithm that is throughput and 
cost optimal, and then develop a practical distributed algorithm. 

2. We implement our practical distributed algorithm and present experi- 
mental results on our Pharos test bed [69] using the Click router [39], 
where we implement the radio and network aspects and emulate mobility. 

4.4 Network Model 

The network consists of $L$ stationary nodes and one mobile carrier^1.
The stationary nodes do not move and cannot communicate with each other
directly; they must rely on the mobile carrier to transport data among them.
We assume that stationary nodes generate data for other stationary nodes.
Let $d_l$ denote the destination node of the data stream generated by stationary
node $l$, and let $x_l^{d_l}$ (pkts/time slot) be the corresponding average rate. Let
$x = \{x_l^{d_l}\}$. Define

$$h_{l,l'} = \begin{cases} 1 & \text{if } l' = d_l, \\ 0 & \text{else.} \end{cases}$$

^1 We assume one mobile carrier only to simplify the notation. This assumption
can be easily removed; see Section 4.5.2 for the multiple mobile formulation.

A stationary node can exchange data with the mobile carrier when the
two come into contact. During each contact, the mobile carrier can send $\eta_d$
packets to the stationary node, and receive $\eta_p$ packets from the stationary
node. We call the transmissions from a stationary node to the mobile carrier
a pick up, and the transmissions from the mobile carrier to a stationary node
a drop off.

We assume that there is a terminal V in the network. This terminal 
does not generate data. It is there to facilitate notation and understanding of 
the definition of route, which we define next. 

Definition 1. A route of the mobile carrier starts and ends at the terminal.
The route is a (finite) set of tuples $(s, n_s)$ where $s$ is a stationary node and $n_s$
is the number of times the mobile visits $s$ on that route. The route is further
specified by the time required to patrol that route. ■

Assumption 4. We assume that there are $J$ routes for the mobile carrier,
which are indexed by $j$. Stationary node $l$ is assessed a cost $a_{l,j}$ for every
packet picked up by the mobile to be sent over route $j$ ($a_{l,j}$ is called the pick up
cost). The mobile incurs a cost of $b_j$ per time slot when patrolling route $j$. ■

An example of a route is

$$R_1 = \{(V, 2), (l_1, 1), (l_2, 4), 10\text{ mins}\}.$$

On this route, the mobile starts at $V$, visits $l_1$ once and $l_2$ four times before
returning to $V$. The time the mobile takes to patrol $R_1$ is 10 mins. The route
$R_1$ is different from the route $R_2 = \{(V, 2), (l_1, 2), (l_2, 4), 10\text{ mins}\}$ because $l_1$
is visited twice on $R_2$ but only once on $R_1$. $R_1$ is also different from the
route $R_3 = \{(V, 2), (l_1, 1), (l_2, 4), 5\text{ mins}\}$ because it takes less time to patrol
$R_3$. Note that the mobile carrier must return to the terminal before switching
onto another route. Though not shown, the terminal in the network in Figure
4.1 would be located where the left and right routes meet (right under where
the mobile is).

We can associate higher $b_j$ with the routes on which the mobile moves
faster since that would require more fuel. We assume that $a_{l,j} \le a_{\max}$, $\forall l, j$,
and $b_j \le b_{\max}$, $\forall j$. We let $f_j$ denote the fraction of time the mobile carrier
is on route $j$, and $T_j$ denote the time required to patrol route $j$ (in units
of time slots). Assume that $T_{\min} \le T_j \le T_{\max}$, $\forall j$. If $N$ routes have been
patrolled, and out of the $N$ routes, the mobile patrolled route $j$ $N_j$ times,
then $f_j = N_j T_j / \sum_{j'} N_{j'} T_{j'}$. Note that not all stationary nodes may be included on
one route, so the mobile may have to switch from one route to another to
transport packets from a source to its destination.

Definition 1 specifies a route of a mobile via the collection of
stationary nodes that a mobile visits, the number of times that each of them is
visited, and the total time taken to physically traverse this route. Note that
several physical paths (i.e., the actual geographic paths) can share the same
mobile route as specified by this description (e.g., the difference between two
physical paths could be the order in which the stationary nodes are visited, or
the actual trajectory could be different; however, the path characteristics
as summarized by Definition 1 could be the same). In this case, multiple
physical paths would be mapped to the same route. We choose
these parameters to define a route because the list of stationary nodes, along
with the number of times that they are visited, describes the transfer capacity
between the mobile and stationary nodes, and this, along with the time
duration of the mobile route, is needed to describe the rate of data transfer between
the mobile and stationary nodes (rate = (number of contacts) x (packets
transferred per contact) / (time duration of mobile route), see Table 4.1). Different
physical paths with the same route parameters in Definition 1 lead to the same
data transfer constraints; hence we do not distinguish between them (as the
rest of the physical route properties are not relevant to our model for data
transfer), and our algorithm will treat them all as the same route.

Note that if, for some reason, we need to distinguish between physical
paths that have the same route (e.g., with different costs), it is easy to do so
by one of two means: (i) simply change the original route time durations ($T$,
which is originally the same for the two routes) to be $(T - \epsilon)$ and $(T + \epsilon)$,
for some arbitrarily small $\epsilon$; or (ii) augment the notation in Definition 1 to
have an additional route-index parameter (that distinguishes between the two
physical paths). All that this will change is to add an extra route queue in
the Min-Cost Mobility Control Algorithm (see Section 4.5), and the algorithm
will proceed to load-balance between these two routes based on the costs.

Surveillance Requirement: The mobile must periodically patrol route $j$ to
guarantee that $f_j \ge p_j$ for some $p_j \ge 0$, $\forall j$. ■

We let $C_{l,j}$ denote the number of contacts that can be made between
the mobile carrier and stationary node $l$ on route $j$. Let $P_{l,j}$ and $D_{l,j}$ denote
the rates that the mobile can receive from and transmit to node $l$ on route $j$,
i.e., $P_{l,j} = \eta_p C_{l,j} / T_j$ and $D_{l,j} = \eta_d C_{l,j} / T_j$.
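
As an illustration of how Definition 1 and the rate definitions fit together,
the following sketch encodes a route as its visit counts plus its duration and
derives the pick-up rate and the time fractions from them. The class name
`Route`, the helper names, the value of 100 packets per contact, and the choice
of one time slot per minute are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Route:
    """A route in the sense of Definition 1: visit counts plus duration."""
    visits: Dict[str, int]   # node name -> number of visits (terminal "V" included)
    duration: float          # T_j, in time slots (assumed: 1 slot = 1 minute)

    def contacts(self, node: str) -> int:
        """C_{l,j}: number of contacts with `node` on this route."""
        return self.visits.get(node, 0)


def pickup_rate(route: Route, node: str, eta_p: float) -> float:
    """P_{l,j} = eta_p * C_{l,j} / T_j."""
    return eta_p * route.contacts(node) / route.duration


def dropoff_rate(route: Route, node: str, eta_d: float) -> float:
    """D_{l,j} = eta_d * C_{l,j} / T_j."""
    return eta_d * route.contacts(node) / route.duration


def time_fractions(routes: Dict[str, Route], counts: Dict[str, int]) -> Dict[str, float]:
    """f_j = N_j T_j / sum_j' N_j' T_j' for the routes patrolled so far."""
    total = sum(counts[name] * routes[name].duration for name in routes)
    return {name: counts[name] * routes[name].duration / total for name in routes}


# The three example routes from the text.
R1 = Route({"V": 2, "l1": 1, "l2": 4}, 10.0)
R2 = Route({"V": 2, "l1": 2, "l2": 4}, 10.0)
R3 = Route({"V": 2, "l1": 1, "l2": 4}, 5.0)

print(R1 != R2, R1 != R3)                      # routes differ in visits or duration
print(pickup_rate(R1, "l2", eta_p=100.0))      # pkts per slot on R1
print(time_fractions({"R1": R1, "R3": R3}, {"R1": 1, "R3": 2}))
```

Two physical paths that produce equal `Route` values are indistinguishable to
this representation, which mirrors the way the model maps them to one route.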



Each stationary node $l$ maintains a queue $q_{l,j}$ for each route $j$,
$j = 1, \ldots, J$. Node $l$ deposits into queue $q_{l,j}$ the packets it wants the mobile to pick
up on route $j$. The mobile maintains a queue $Q_{l'}$ for each destination node
$l' = 1, \ldots, L$.

Definition 2. We say the network is stable if all queues are bounded. ■ 



We let 



1 Pi,j > 
" 1 else. 



Finally, we define

$$\eta_{\max} = \max\left\{ \max_{l,j} \{P_{l,j} T_j\}, \; \max_{l,j} \{D_{l,j} T_j\} \right\}.$$



4.5 Min-Cost Mobility Control 

In this section, we will introduce our min-cost mobility control
algorithm. The variables associated with our algorithm along with their brief
physical description are given in Table 4.1.

Notation            Description
$x_l^{d_l}$         rate stationary $l$ generates data for stationary $d_l$
$y_{l,j}^{d_l}$     rate mobile picks up data from $l$ on route $j$, destined for $d_l$
$p_j$ ($f_j$)       minimum (actual) fraction of time mobile spends on route $j$
$T_j$               time duration of mobile route $j$
$a_{l,j}$           cost $l$ pays to have a packet picked up on route $j$
$b_j$               cost per unit time mobile pays to spend on route $j$
$q_{l,j}$           queue $l$ maintains for packets to be picked up on route $j$
$Q_l$               queue mobile maintains for packets destined for $l$
$\delta_{l,j}$      fraction of total resource on route $j$ stationary $l$ uses
$w_j$               counter mobile uses to satisfy the surveillance req. (4.3)
$K$                 tuning param. that controls optimality and queue sizes
$\kappa$            tuning parameter that controls the size of the counter $w_j$
$\eta_p$            # of packets picked up per contact
$\eta_d$            # of packets dropped off per contact
$C_{l,j}$           # of contacts mobile makes with $l$ on route $j$
$P_{l,j}$           data pick-up rate from $l$ on route $j$ ($\eta_p C_{l,j}/T_j$)
$D_{l,j}$           data drop-off rate to $l$ on route $j$ ($\eta_d C_{l,j}/T_j$)

Table 4.1: Description of variables used

Define $y_{l,j}^{d_l}$ to be the (average, long-term) rate at which data generated
by stationary node $l$ is picked up by the
mobile carrier when patrolling route $j$. The objective of the min-cost mobility
control is to find $\mathbf{f} = \{f_j, j = 1, \ldots, J\}$ and
$\mathbf{y} = \{y_{l,j}^{d_l}, l = 1, \ldots, L, j = 1, \ldots, J\}$
(pick up rate splitting) to (i) stabilize the network, i.e., guarantee that the
queue lengths are bounded, (ii) minimize the average cost
$\sum_{l,j} y_{l,j}^{d_l} a_{l,j} + \sum_j f_j b_j$,
and (iii) satisfy the surveillance requirement. These decisions are determined
adaptively via a dual-decomposition inspired mobility control algorithm.

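Definition 3. We say the arrival rate $x$ is supportable if there exists $(\mathbf{f}, \mathbf{y})$
such that:

$$\sum_j y_{l,j}^{d_l} = x_l^{d_l}, \quad \forall l \tag{4.1}$$

$$\sum_j f_j \le 1 \tag{4.2}$$

$$f_j \ge p_j \in [0, 1], \quad j = 1, \ldots, J \tag{4.3}$$

$$y_{l,j}^{d_l} = \delta_{l,j} f_j P_{l,j}, \quad \forall l, j \tag{4.4}$$

$$\sum_{l,j} h_{l,l'} \delta_{l,j} f_j P_{l,j} \le \sum_j f_j D_{l',j}, \quad \forall l' \tag{4.5}$$

$$0 \le \delta_{l,j} \le 1. \tag{4.6}$$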

Let $A$ be the set of all supportable arrival rates. Given $x \in A$, let $T_x$
denote the set of $(\mathbf{f}, \mathbf{y})$ that satisfy the equations in Definition 3. An arrival rate
$x$ is said to be supportable if there exist $\mathbf{f}$ and $\mathbf{y}$ such that: (i) The summation
of $f_j$ (the fraction of time a mobile spends on route $j$) is at most one (eq.
(4.2)) and the surveillance requirement (eq. (4.3)) is satisfied; (ii) The rate at
which packets are transmitted from stationary node $l$ to the mobile while it is
patrolling route $j$ is upper bounded by the pick up rate by the mobile on route
$j$ (eq. (4.4)); (iii) The rate at which the data for destination node $l'$ is picked
up from all stationary nodes and on all routes is limited by the combined rate
at which data is dropped off to node $l'$ on all routes (eq. (4.5)). The parameter
$\delta_{l,j}$ controls the fraction of active contacts between node $l$ and the mobile on
route $j$; when the mobile and stationary node $l$ come into contact, they have
the option of using only part, if any, of the transfer capacity of the contact;
and (iv) The total rate the data is picked up from node $l$ on all routes should
be equal to the rate at which node $l$ generates data (eq. (4.1)).
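
To make the conditions concrete, the following sketch checks constraints
(4.1) through (4.6) for one candidate $(\mathbf{f}, \mathbf{y}, \delta)$ on the two-route example
of Section 4.2. The function names, the dictionary layout, and the rates of
100 pkts/min (200 packets per contact over a 2-minute route) are assumptions
made for illustration; the check only verifies a given candidate and does not
search over all candidates.

```python
def is_supportable(x, dest, f, y, delta, P, D, p, tol=1e-9):
    """Return True if (f, y, delta) satisfies constraints (4.1)-(4.6)."""
    routes = list(f)
    if sum(f.values()) > 1 + tol:                       # (4.2)
        return False
    if any(f[j] < p[j] - tol for j in routes):          # (4.3)
        return False
    for l in x:
        picked_total = sum(y.get((l, j), 0.0) for j in routes)
        if abs(picked_total - x[l]) > tol:              # (4.1)
            return False
        for j in routes:
            d = delta.get((l, j), 0.0)
            if not -tol <= d <= 1 + tol:                # (4.6)
                return False
            rate = d * f[j] * P.get((l, j), 0.0)
            if abs(y.get((l, j), 0.0) - rate) > tol:    # (4.4)
                return False
    for dst in set(dest.values()):                      # (4.5)
        picked = sum(delta.get((l, j), 0.0) * f[j] * P.get((l, j), 0.0)
                     for l in x if dest[l] == dst for j in routes)
        dropped = sum(f[j] * D.get((dst, j), 0.0) for j in routes)
        if picked > dropped + tol:
            return False
    return True


# Two-route example (assumed rates: 200 pkts per contact / 2 min per route).
x = {"S1": 70.0, "S3": 20.0}
dest = {"S1": "S2", "S3": "S4"}
P = {("S1", "left"): 100.0, ("S3", "right"): 100.0}
D = {("S2", "left"): 100.0, ("S4", "right"): 100.0}
p = {"left": 0.15, "right": 0.15}


def candidate(f):
    """Use as much of each contact as the flow needs, capped at delta = 1."""
    delta = {k: min(1.0, x[k[0]] / (f[k[1]] * P[k])) for k in P}
    y = {k: delta[k] * f[k[1]] * P[k] for k in P}
    return y, delta


f_adaptive = {"left": 0.75, "right": 0.25}
f_fixed = {"left": 0.50, "right": 0.50}
ok_adaptive = is_supportable(x, dest, f_adaptive, *candidate(f_adaptive), P, D, p)
ok_fixed = is_supportable(x, dest, f_fixed, *candidate(f_fixed), P, D, p)
print(ok_adaptive, ok_fixed)
```

With the 50-50 split, even $\delta = 1$ picks up only 50 pkts/min from S1, so
condition (4.1) fails for this candidate.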

Note that we cannot replace conditions (4.4) and (4.5) with

$$y_{l,j}^{d_l} \le f_j P_{l,j}, \tag{4.7}$$

$$\sum_{j,l} h_{l,l'} f_j P_{l,j} \le \sum_j f_j D_{l',j}. \tag{4.8}$$

Consider two sources $s_1$ and $s_2$, with destinations $d_1$ and $d_2$, respectively.
Assume that $s_1$ and $s_2$ can be visited by the mobile only when the mobile is
on route $R_0$, and $x_{s_1}^{d_1} > x_{s_2}^{d_2}$. Suppose the pick-up rates from $s_1$ and $s_2$ are the
same ($P_{s_1,R_0} = P_{s_2,R_0}$). Then if we enforce constraint (4.7),
$x_{s_1}^{d_1} = y_{s_1,R_0}^{d_1} \le f_{R_0} P_{s_1,R_0}$ and
$x_{s_2}^{d_2} = y_{s_2,R_0}^{d_2} \le f_{R_0} P_{s_2,R_0} = f_{R_0} P_{s_1,R_0}$.
Assume that $d_1$ and $d_2$ are
contacted by the mobile only when it is on routes $R_1$ and $R_2$, respectively, and
they are receiving data only from $s_1$ and $s_2$, respectively. In addition, suppose
that the drop-off rates to $d_1$ and $d_2$ are the same ($D_{d_1,R_1} = D_{d_2,R_2}$). If we
enforce constraint (4.8), then $f_{R_0} P_{s_1,R_0} \le f_{R_1} D_{d_1,R_1}$ and
$f_{R_0} P_{s_1,R_0} = f_{R_0} P_{s_2,R_0} \le f_{R_2} D_{d_2,R_2}$,
which implies that the mobile has to spend just as much time on $R_2$
as $R_1$, even though $x_{s_1}^{d_1} > x_{s_2}^{d_2}$. We resolve this issue by using $\delta_{l,j}$.

The objective of the minimum-cost mobility control is to solve the
following optimization problem: Given $x \in A$ and for any fixed $K > 0$,

$$\text{minimize} \quad K \sum_{l,j} y_{l,j}^{d_l} a_{l,j} + K \sum_j f_j b_j \tag{4.9}$$

$$\text{subject to} \quad (\mathbf{f}, \mathbf{y}) \in T_x.$$

Let $(\hat{\mathbf{f}}, \hat{\mathbf{y}})$ denote an optimal solution to (4.9).
As we will see later, by suitably choosing the value of parameter $K > 0$
in the optimization problem, the solution generated by our algorithm will be
sufficiently "close" to the optimal. Next, note that multiplying both sides of
constraint (4.3) by any positive constant $\kappa > 0$ does not change the
condition. Thus, the partial Lagrange dual of the optimization problem (4.9) is the
following^2:

$$\begin{aligned}
L(q_{l,j}, Q_{l'}, w_j) = \min_{y_{l,j}^{d_l}, f_j, \delta_{l,j}} \Big\{ & K \sum_{l,j} y_{l,j}^{d_l} a_{l,j} - \sum_{l,j} q_{l,j} \big( \delta_{l,j} f_j P_{l,j} - y_{l,j}^{d_l} \big) \\
& - \sum_{l'} Q_{l'} \Big( \sum_j f_j D_{l',j} - \sum_{j,l} h_{l,l'} \delta_{l,j} f_j P_{l,j} \Big) \\
& + K \sum_j f_j b_j - \sum_j \kappa w_j (f_j - p_j) \Big\}
\end{aligned}$$

subject to 1) $\sum_j y_{l,j}^{d_l} = x_l^{d_l}$, 2) $\sum_j f_j \le 1$, and 3) $\delta_{l,j} \in [0, 1]$. As we will
see later, the parameter $\kappa$ is useful in our algorithm in order to match the
time-scale of route selection with the time-scale of queue-length variation.

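We now observe that we can decompose the Lagrange dual into two
subproblems:

$$\min_{y_{l,j}^{d_l}} \; K \sum_j y_{l,j}^{d_l} a_{l,j} + \sum_j q_{l,j} y_{l,j}^{d_l} \quad \text{s.t.} \; \sum_j y_{l,j}^{d_l} = x_l^{d_l} \tag{4.10}$$

for each $l = 1, \ldots, L$ and

$$\begin{aligned}
\max_{f_j, \delta_{l,j}} \; & \sum_{l,j} q_{l,j} \delta_{l,j} f_j P_{l,j} + \kappa \sum_j w_j (f_j - p_j) \\
& + \sum_{l'} Q_{l'} \sum_j f_j \Big[ D_{l',j} - \sum_l h_{l,l'} \delta_{l,j} P_{l,j} \Big] - K \sum_j f_j b_j
\end{aligned} \tag{4.11}$$

$$\text{s.t.} \; \sum_j f_j \le 1 \text{ and } \delta_{l,j} \in [0, 1] \; \forall l, j.$$

^2 In the notation, $l'$ generally refers to the destination stationary, while $l$ generally refers
to the source stationary. If we want to be explicit, the destination of the flow originating
from the source $l$ is denoted $d_l$.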

Motivated by the dual decomposition, we propose the following
min-cost mobility control algorithm. Here, the index $k$ denotes the $k$-th route
selection. If the $k$-th route is $j$, the time duration between the $k$-th and $(k + 1)$-th
route selections is $T_j$. In the min-cost mobility control algorithm, a stationary
node deposits its packets into a queue that solves the subproblem (4.10), and
a mobile station selects its route by solving subproblem (4.11). In addition to
packet queues, each mobile station maintains a deficit counter for each route.
The size of a deficit counter indicates the number of times the mobile node
needs to further patrol the route to fulfill the surveillance requirement.

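Min-Cost Mobility Control Algorithm

(i) Stationary node $l$ deposits $y_{l,j}^{d_l}(k)$ packets into queue $q_{l,j}$, where

$$y_{l,j}^{d_l}(k) = \begin{cases} x_l^{d_l} & \text{if } j = j^*(k), \\ 0 & \text{else,} \end{cases} \tag{4.12}$$

and $j^*(k) = \arg\min_j \{K a_{l,j} + q_{l,j}(k)\}$.

(ii) The $k$-th route $j^*(k)$ selected by the mobile carrier is such that

$$\begin{aligned}
j^*(k) \in \arg\max_j \Big\{ & \sum_l q_{l,j}(k) \delta_{l,j}(k) P_{l,j} - K b_j \\
& + \sum_{l'} Q_{l'}(k) \Big( D_{l',j} - \sum_l h_{l,l'} \delta_{l,j}(k) P_{l,j} \Big) \\
& + \kappa w_j(k) (1 - p_j) \Big\}
\end{aligned} \tag{4.13}$$

where

$$\delta_{l,j}(k) = \begin{cases} 1 & \text{if } q_{l,j}(k) > Q_{d_l}(k), \\ 0 & \text{else,} \end{cases} \tag{4.14}$$

which maximizes (4.11) in $\delta_{l,j}$, since the coefficient of $\delta_{l,j}$ there is
$f_j P_{l,j} (q_{l,j} - Q_{d_l})$.
The mobile will pick up data from $l$ on route $j^*(k)$ if and only if
$\delta_{l,j^*(k)}(k) = 1$. In addition,

$$f_j(k) = \begin{cases} 1 & \text{if } j = j^*(k), \\ 0 & \text{else,} \end{cases} \tag{4.15}$$

and

$$T(k) = T_{j^*(k)}. \tag{4.16}$$

(iii) The queues are updated as follows:

$$q_{l,j}(k + 1) = \Big[ q_{l,j}(k) + T(k) \big( y_{l,j}^{d_l}(k) - \delta_{l,j}(k) P_{l,j} 1_{\{f_j(k) = 1\}} \big) \Big]^+ \tag{4.17}$$

$$Q_{l'}(k + 1) = \Big[ Q_{l'}(k) + T(k) \sum_j 1_{\{f_j(k) = 1\}} \Big( \sum_l h_{l,l'} \delta_{l,j}(k) P_{l,j} - D_{l',j} \Big) \Big]^+ \tag{4.18}$$

(iv) The algorithm maintains a deficit counter $w_j(k)$ for route $j$ such that at
step $k$, $w_j(k)$ is increased by $T(k) p_j$ and decreased by $T(k) f_j(k)$, i.e.,

$$w_j(k + 1) = \Big[ w_j(k) + p_j T(k) - 1_{\{f_j(k) = 1\}} T(k) \Big]^+. \tag{4.19}$$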
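
The four steps can be exercised on the two-route example of Section 4.2 with
a short simulation sketch. Everything specific here is an assumption made for
illustration: one time slot equals one minute, all costs $a_{l,j}$ and $b_j$ are
zero, the rates are 100 pkts/min (200 packets per contact over a 2-minute
route), each source deposits its whole rate into the argmin queue, and a
contact is used when the source queue exceeds the mobile's queue for that
destination. The function name `simulate` is hypothetical.

```python
def simulate(steps=2000, K=1.0, kappa=100.0):
    """Fluid-style run of steps (i)-(iv) on the two-route example."""
    routes = ["left", "right"]
    T = {"left": 2.0, "right": 2.0}            # route durations (slots)
    p = {"left": 0.15, "right": 0.15}          # surveillance requirement
    b = {"left": 0.0, "right": 0.0}            # assumed route costs
    x = {"S1": 70.0, "S3": 20.0}               # arrival rates (pkts/slot)
    dest = {"S1": "S2", "S3": "S4"}
    P = {("S1", "left"): 100.0, ("S3", "right"): 100.0}
    D = {("S2", "left"): 100.0, ("S4", "right"): 100.0}
    a = {key: 0.0 for key in P}                # assumed pick-up costs

    q = {key: 0.0 for key in P}                # source queues q_{l,j}
    Q = {d: 0.0 for d in dest.values()}        # mobile queues Q_{l'}
    w = {j: 0.0 for j in routes}               # deficit counters w_j
    time_on = {j: 0.0 for j in routes}
    max_queue = 0.0

    for _ in range(steps):
        # (i) each source puts its rate into the cheapest-looking queue
        y = {key: 0.0 for key in P}
        for l in x:
            options = [j for j in routes if (l, j) in P]
            best = min(options, key=lambda j: K * a[(l, j)] + q[(l, j)])
            y[(l, best)] = x[l]
        # use a contact only if the source backlog exceeds the mobile's
        delta = {(l, j): 1.0 if q[(l, j)] > Q[dest[l]] else 0.0 for (l, j) in P}

        def inflow(d, j):
            """Rate at which data for destination d is picked up on route j."""
            return sum(delta[(l, jj)] * P[(l, jj)]
                       for (l, jj) in P if jj == j and dest[l] == d)

        def score(j):
            """Objective of the route selection step for route j."""
            pick = sum(q[(l, jj)] * delta[(l, jj)] * P[(l, jj)]
                       for (l, jj) in P if jj == j)
            drop = sum(Q[d] * (D.get((d, j), 0.0) - inflow(d, j)) for d in Q)
            return pick - K * b[j] + drop + kappa * w[j] * (1.0 - p[j])

        # (ii) route selection; ties go to the first route in the list
        j_star = max(routes, key=score)
        Tk = T[j_star]
        # (iii) queue updates, computed from the pre-update state
        new_Q = {d: max(0.0, Q[d] + Tk * (inflow(d, j_star) - D.get((d, j_star), 0.0)))
                 for d in Q}
        for (l, j) in q:
            served = delta[(l, j)] * P[(l, j)] if j == j_star else 0.0
            q[(l, j)] = max(0.0, q[(l, j)] + Tk * (y[(l, j)] - served))
        Q = new_Q
        # (iv) deficit counters
        for j in routes:
            w[j] = max(0.0, w[j] + Tk * p[j] - (Tk if j == j_star else 0.0))

        time_on[j_star] += Tk
        max_queue = max(max_queue, max(q.values()), max(Q.values()))

    total = sum(time_on.values())
    return {j: time_on[j] / total for j in routes}, max_queue


fractions, max_queue = simulate()
print(fractions, max_queue)
```

In this setting the queues stay bounded and the time split settles between
roughly 70-30 and 80-20, which is the range in which both flows and both
patrol requirements can be met under the stated assumptions.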



The next two theorems demonstrate the stability and optimality of our 
proposed algorithm, respectively. 

Theorem 3. Given $x$ such that $(1 + \epsilon)x \in A$ for some $\epsilon > 0$, under the
iterative algorithm above, the network is stable.

Proof: See Appendix A. ■ 

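Theorem 4. Given $x$ such that $(1 + \epsilon)x \in A$ for some $\epsilon > 0$, let
$(\hat{f}_j, \hat{y}_{l,j}^{d_l})$
be the solution to the optimization problem (4.9). As $K \to \infty$, under our
proposed algorithm,

$$\sum_{l,j} \hat{y}_{l,j}^{d_l} a_{l,j} + \sum_j \hat{f}_j b_j = \lim_{K \to \infty} \lim_{k' \to \infty} \frac{1}{\sum_{k<k'} T(k)} \times \sum_{k<k'} T(k) \Big( \sum_{l,j} y_{l,j}^{d_l}(k) a_{l,j} + \sum_j f_j(k) b_j \Big). \tag{4.20}$$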

Proof: See Appendix B. ■ 

Note that $\lim_{k' \to \infty} \frac{1}{\sum_{k<k'} T(k)} \sum_{k<k'} T(k) y_{l,j}^{d_l}(k)$ in equation (4.20) is the
time-average data rate that the mobile picks up from stationary node $l$ on
route $j$, and $\lim_{k' \to \infty} \frac{1}{\sum_{k<k'} T(k)} \sum_{k<k'} T(k) f_j(k)$ is the fraction of the time the
mobile patrols route $j$.


4.5.1 Impact of $K$ and $\kappa$

We know from [50] that $(\mathbf{f}, \mathbf{y})$ obtained by our algorithm is within a
factor of $O(1/K)$ from $(\hat{\mathbf{f}}, \hat{\mathbf{y}})$, while $q_{l,j}$, $Q_{l'}$ and $\kappa w_j$ are
$O(K \max\{\eta_{\max}, \kappa T_{\max}\})$,
where $\max\{\eta_{\max}, \kappa T_{\max}\}$ is the maximum amount by which $q_{l,j}$, $Q_{l'}$, and $\kappa w_j$
can increase between any two consecutive route selections. In order to keep $q_{l,j}$
and $Q_{l'}$ small and make the mobile go into a "surveillance" route without having
$w_j$ build up, we choose $\kappa$ such that $\eta_{\max} \approx \kappa T_{\max}$, i.e.,
$\kappa = \Theta(\eta_{\max}/T_{\max})$.

4.5.2 Multiple Mobiles 

So far, our model assumes only one mobile node to keep the notation
simple. However, one can easily extend the algorithm to a network with multiple
mobiles. A straightforward way to extend our model is to have the
stationary nodes maintain a queue for each route and each mobile, i.e., stationary
node $l$ maintains a queue $q_{l,j,m}$ for route $j$ for mobile $m$. Let $f_{m,j}$ denote the
fraction of time mobile $m$ operates on route $j$. We then have the constraint
$\sum_{j \in M_m} f_{m,j} \le 1$ for each mobile $m$, where $M_m$ is the set of routes mobile $m$
can patrol. Each mobile also has its own "surveillance" or secondary
objective constraints on $f_{m,j}$. Finally, each mobile solves the optimization problem
(4.11) independently, without any cooperation with other mobiles.

It is easy to show that the results in this dissertation immediately carry
over to this more general setting (the proofs are analogous to those presented
here). Lastly, we note that this more general formulation supports multiple
terminals, so that different mobiles can return to different terminals after each
patrol.



4.6 Practical Algorithm 

The min-cost mobility control algorithm we discussed in the previous
section has two shortcomings. One is that the source nodes have to synchronize
their queue selections with the mobile's selection. The second is that the
mobile has to know the $q_{l,j}$'s to make the route selection. However, we can take
advantage of the secondary surveillance objective of the mobile. While the
mobile makes its surveillance round, it can collect the most up-to-date queue
information while picking up and dropping off data, and use the information
in selecting the next route.

Let $k$ denote the $k$-th route the mobile operates on. We can recast
the update equations (4.12), (4.13) and (4.14) into the following practical,
distributed decision controls.

Stationary node: The first time the mobile contacts stationary node
$l$, it communicates the values of the $a_{l,j}$'s corresponding only to the routes on
which it will visit stationary node $l$. For each new packet, the stationary node
$l$ computes

$$j_l^* = \arg\min_j \; K a_{l,j} + q_{l,j} \tag{4.21}$$

and deposits it into $q_{l,j_l^*}$.
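
The per-packet rule (4.21) is small enough to sketch directly. The function
names and the cost values below are assumptions chosen for illustration, and
the queues are not drained here so that the effect of $K$ is visible in
isolation.

```python
def choose_queue(K, costs, queues):
    """Route index minimizing K * a_{l,j} + q_{l,j} over the node's routes.

    `costs` and `queues` map route name -> a_{l,j} and current q_{l,j}.
    Ties are broken in favor of the route listed first in `costs`.
    """
    return min(costs, key=lambda j: K * costs[j] + queues[j])


def deposit(K, costs, queues):
    """Place one new packet according to rule (4.21) and return its route."""
    j = choose_queue(K, costs, queues)
    queues[j] += 1
    return j


# Hypothetical node visited on a costly route R1 and a free route R2.
costs = {"R1": 1.0, "R2": 0.0}
queues = {"R1": 0, "R2": 0}
placed = [deposit(3.0, costs, queues) for _ in range(4)]
print(placed, queues)
```

With $K = 3$ and a cost gap of 1, the free queue absorbs three packets before
the costly queue receives one. This is one way to see why larger $K$ pushes
more traffic onto the cheap routes at the price of longer queues.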

Mobile: Each time the mobile meets stationary node $l$, it collects all
queue size information from node $l$ at the end of the contact. At the end of
the execution of a route, the mobile computes the next route, $f_j, \delta_{l,j} \in \{0, 1\}$,
by maximizing the objective of (4.11), where $q_{l,j}$ denotes the most
up-to-date information known to the mobile.

4.7 Experimental Results

We implemented the practical version of our algorithm as described in 
Section 4.6 in Click [39] on our testbed [69] (see also [59] for more details). The 
purpose of the experiment is to demonstrate that our practical algorithm can
come arbitrarily close to the optimal value, at the expense of longer queues,
as $K \to \infty$. The algorithm
presented in Section 4.5 assumes perfect knowledge of the queue length by the
mobile, while the practical version does not. To this end, we build our
experimental network with WiFi cards on the Proteus platform [69]. We emulate the
mobility by timed contacts, i.e., the mobile node is, in our emulation, station- 
ary, and would make contact with one static node, then wait for some time 
before making contact with another static node. These timed contacts are 
implemented simply by turning on and off the appropriate wireless interfaces 
for the various nodes.

Figure 4.3: A network with one mobile and 12 stationaries, 4 in each region.
The mobile has two modes per region: a fast and a slow route. In the fast route,
the mobile goes through a region and comes back to the center in one minute;
in the slow route, it takes two minutes. All stationaries in the region are contacted
in each route. (The terminal would be located where the three regions meet.)

Figure 4.4: Queue lengths observed at S1: Increasing $K$ results in more optimal
rate allocations, but it comes at the price of longer queues and longer time for
convergence.

Figure 4.5: Queue lengths observed at S4



4.7.1 Experiment 1 

The network we used in this experiment is shown in Figure 4.1. We use
the following routes:

$R_1 = \{(V, 2), (S1, 1), (S2, 1), 1\text{ min}\}$

$R_2 = \{(V, 2), (S1, 1), (S2, 1), 2\text{ mins}\}$

$R_3 = \{(V, 2), (S3, 1), (S4, 1), 1\text{ min}\}$

$R_4 = \{(V, 2), (S3, 1), (S4, 1), 2\text{ mins}\}$

We used two flows, one from SI to S2, at the rate of 40pkts/min and another 
one from S2 to S3 at the rate of 30pkts/min. The mobile and the nodes can 
transmit lOOpkts per contact. The routes R2 and R4 must be patrolled at least 
10% of the time, i.e., p 2 = Pa = 0.1 and pi = p 3 = 0. asi^ = as2,R 1 = K and 
a si,R 2 = a S2,R 2 — 0; b Rl = b Rz = K and 6r 2 = b Ri = 0. The rate splitting over 
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K 






150 


300 


600 


Optimal 


II s2 

Vsi,Ri 


20 


18.376 


15.8 


15 


ysi,R 2 


20 


21.624 


24.2 


25 


VS2,Ri 


9.9 


8.598 


5.97 


5 


VS2,R 2 


20.1 


21.402 


24.03 


25 



Table 4.2: Experiment 1: The mobile has to travel the left routes R\ and R2 
to pick up data from S2 and travel the right routes R3 and R4 to drop off data 
to S3. The mobile must patrol routes R2 and R4 at least 10% of the time to 
satisfy the surveillance requirement. The stationary nodes and the mobile will 
try to utilize the "cheaper" routes R2 and R4 (thus, more packets are picked 
up on those routes) before the more costly routes R\ and i? 3 . (The units are 
pkts/min.) 

the routes for the flows $x^{S2}_{S1}$ and $x^{S3}_{S2}$ is shown in Table 4.2. (The optimal values in Tables
4.2 and 4.3 are obtained by numerically solving the optimization problem (4.9)
using MATLAB.)
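
As a quick sanity check on Table 4.2, the sketch below re-enters the reported rates and confirms that, for every K, the rates of each flow over R1 and R2 add up to the offered load (40 pkts/min from S1 and 30 pkts/min from S2). The dictionary layout and the helper name `split_totals` are illustrative assumptions and are not part of the original experiment code.

```python
# Rates reported in Table 4.2 (pkts/min), keyed by (source, route).
# Each list holds one value per column: K = 150, K = 300, K = 600, Optimal.
COLUMNS = ["K=150", "K=300", "K=600", "Optimal"]
TABLE_4_2 = {
    ("S1", "R1"): [20.0, 18.376, 15.8, 15.0],
    ("S1", "R2"): [20.0, 21.624, 24.2, 25.0],
    ("S2", "R1"): [9.9, 8.598, 5.97, 5.0],
    ("S2", "R2"): [20.1, 21.402, 24.03, 25.0],
}
OFFERED_LOAD = {"S1": 40.0, "S2": 30.0}  # flow rates stated in the text


def split_totals(table, source):
    """Sum the per-route rates of one source, column by column."""
    rows = [rates for (src, _route), rates in table.items() if src == source]
    # Rounding only removes floating-point noise from the additions.
    return [round(sum(column), 6) for column in zip(*rows)]


for source, load in OFFERED_LOAD.items():
    totals = split_totals(TABLE_4_2, source)
    print(source, dict(zip(COLUMNS, totals)), "offered load:", load)
```

Every column sums to the offered load, so the table differs across K only in how each flow is divided between the two routes, not in how much traffic is carried.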

4.7.2 Experiment 2 

In this experiment, we used the network shown in Figure 4.3. The
network is composed of three regions, A, B, and C. In each region, the mobile
has two routes, fast and slow. We use $A_f$ and $A_s$ to denote the fast and slow
routes in region A, respectively (likewise $B_f$ and $B_s$ for region B, and $C_f$
and $C_s$ for region C). On a fast route, the mobile goes through the region in
one minute; on a slow route, the mobile takes two minutes. Each stationary
node in a region is contacted once on each route made through that region.

On each route, the mobile makes contacts starting from the lowest-numbered
stationary to the highest. Each slow route is required to be patrolled
at least 10% of the time, i.e., $p_{A_s} = p_{B_s} = p_{C_s} = 0.1$. On each contact, the
mobile can pick up and drop off 100 pkts (200 pkts total).

S1 generates data for S3 at a rate of 23 pkts/min; S4 generates data for S6 at
20 pkts/min. S9 generates data for S8 at a rate of 20 pkts/min, and S12 generates
data for S10 at a rate of 23 pkts/min. The packet pick-up costs $a_{l,A_s}$, $a_{l,B_s}$,
and $a_{l,C_s}$ for the slow routes are 0 for all stationary nodes; the pick-up costs
$a_{l,A_f}$, $a_{l,B_f}$, and $a_{l,C_f}$ for the fast routes are K. The route costs are
$b_{A_s} = b_{B_s} = b_{C_s} = 0$ and $b_{A_f} = b_{B_f} = b_{C_f} = K$. We let K = 150, 450,
and 900. $\kappa$ is set to 100.

We compute the optimal rate splitting by solving the optimization prob- 
lem (4.9) (shown in the "Optimal" column) and compare against the observed 
rate splitting under different values of K. As predicted by Theorem 4, the 
rate allocation approaches the optimal allocation as K increases as shown in 
Table 4.3. The price of being close to the optimal rates is long queues, as 
demonstrated in Figures 4.4 and 4.5 for K = 150 and 900. 

  $y^{d_l}_{l,j}$          K = 150    K = 450    K = 900    Optimal
  $y^{S3}_{S1,A_f}$        10.488     9.5        8.51       8.5
  $y^{S3}_{S1,A_s}$        12.512     13.5       14.49      14.5
  $y^{S6}_{S4,A_f}$        7.6        6.172      5.4        5.5
  $y^{S6}_{S4,A_s}$        12.4       13.828     14.6       14.5
  $y^{S8}_{S9,C_f}$        7.64       6.112      5.378      5.5
  $y^{S8}_{S9,C_s}$        12.36      13.888     14.622     14.5
  $y^{S10}_{S12,C_f}$      10.78      9.33       8.372      8.5
  $y^{S10}_{S12,C_s}$      12.22      13.67      14.628     14.5

Table 4.3: Experiment 2: As K increases, the rate splitting approaches the 
optimal. (The units are pkts/min.) 
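
The trend in Table 4.3 can be summarized by the largest gap between an observed rate and its optimal value for each K. The sketch below re-enters the table and computes that gap. The helper name `max_gap_per_k` is hypothetical, and the row label for the slow-route rate of S9 is inferred from the surrounding rows because it is missing in the extracted table.

```python
# Observed rates for K = 150, 450, 900 and the optimal rate (pkts/min),
# re-entered from Table 4.3 and keyed by (source, route).
K_VALUES = [150, 450, 900]
TABLE_4_3 = {
    ("S1", "A_f"): ([10.488, 9.5, 8.51], 8.5),
    ("S1", "A_s"): ([12.512, 13.5, 14.49], 14.5),
    ("S4", "A_f"): ([7.6, 6.172, 5.4], 5.5),
    ("S4", "A_s"): ([12.4, 13.828, 14.6], 14.5),
    ("S9", "C_f"): ([7.64, 6.112, 5.378], 5.5),
    ("S9", "C_s"): ([12.36, 13.888, 14.622], 14.5),  # row label inferred
    ("S12", "C_f"): ([10.78, 9.33, 8.372], 8.5),
    ("S12", "C_s"): ([12.22, 13.67, 14.628], 14.5),
}


def max_gap_per_k(table, k_values):
    """Largest |observed - optimal| over all rows, for each value of K."""
    return {
        k: max(abs(observed[i] - optimal) for observed, optimal in table.values())
        for i, k in enumerate(k_values)
    }


for k, gap in max_gap_per_k(TABLE_4_3, K_VALUES).items():
    print(f"K = {k}: max gap = {gap:.3f} pkts/min")
```

The largest gap shrinks with each increase in K, which matches the behavior described for Theorem 4.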

Appendix A: Proof of Theorem 3 

Consider the Lyapunov function
\[
V(k) = \sum_{l,j} \left(q_{l,j}(k)\right)^2 + \sum_{l'} \left(Q_{l'}(k)\right)^2 + \kappa \sum_j \left(w_j(k)\right)^2 .
\]
Define $\Delta V(k) = V(k+1) - V(k)$. Note that
\begin{align}
\Delta V(k) \le{} & \sum_{l,j} \left(q_{l,j}(k+1) - q_{l,j}(k)\right)\left(2q_{l,j}(k) + \eta_{\max}\right) \tag{4.23} \\
 & + \sum_{l'} \left(Q_{l'}(k+1) - Q_{l'}(k)\right)\left(2Q_{l'}(k) + \eta_{\max}\right) \nonumber \\
 & + \kappa \sum_j \left(w_j(k+1) - w_j(k)\right)\left(2w_j(k) + T_{\max}\right) \nonumber
\end{align}
since $(q_{l,j}(k+1))^2 - (q_{l,j}(k))^2 = (q_{l,j}(k+1) + q_{l,j}(k))(q_{l,j}(k+1) - q_{l,j}(k))$ and
$q_{l,j}(k+1) + q_{l,j}(k) \le 2q_{l,j}(k) + \eta_{\max}$. (Likewise for $Q_{l'}(k)$ and $w_j(k)$.) The
maximum amount by which $q_{l,j}(k)$ and $Q_{l'}(k)$ can increase in one iteration is
$\eta_{\max}$; the maximum amount by which $w_j(k)$ can increase in one iteration is
$T_{\max}$.

We prove that there exists $q_{\max}$ such that if $q_{l,j}(k)$ or $Q_{l'}(k) > q_{\max}$ for
some $l$, $j$, $l'$, and $k$, then
\[
\Delta V(k) < -a \tag{4.24}
\]
where $a > 0$. Equation (4.24) would prove that the evolution of $\{q_{l,j}(k), Q_{l'}(k)\}$ is
upper bounded.

Define: 

\[
\tilde q_{l,j}(k) = \max\left\{T(k)\,\delta_{l,j}(k)\, f_j(k)\, P_{l,j} - q_{l,j}(k),\ 0\right\} \tag{4.25}
\]

Q v {k) = maxj^^/.^A'j-QKfc)^! ^ 

\[
\tilde w_j(k) = \max\left\{T(k) - w_j(k),\ 0\right\}. \tag{4.27}
\]

Note that if $q_{l,j}(k) > \eta_{\max}$ and $Q_{l'}(k) > \eta_{\max}$, then $\tilde q_{l,j}(k) = 0$ and $\tilde Q_{l'}(k) = 0$,
respectively, and if $w_j(k) > T_{\max}$, then $\tilde w_j(k) = 0$.

Using equations (4.17), (4.18), (4.19), (4.25), (4.26), and (4.27), the
Lyapunov drift inequality (4.23) can be bounded as

AV(k) < 2 \T(k) (vt,(k) - 8i,j(k)fj(k)Pi t j) + q^k)} %j {k) 



fj(k)T(k) J]Wy(*)flj - D i'j ) +Qi'( k ) 
i' i V i 

+2k {T{k) Pj - T{k)fj(k) + Wj(k)) wj(k) 
j 



Qv{k) 



n2 

max 
because 1) $q_{l,j}(k+1) - q_{l,j}(k) \le 2\eta_{\max}$, 2) $Q_{l'}(k+1) - Q_{l'}(k) \le 2\eta_{\max}$, 3)
$w_j(k+1) - w_j(k) \le 2T_{\max}$, and 4)

\[
q_{l,j}(k+1) - q_{l,j}(k) \le T(k)\left(y^{d_l}_{l,j}(k) - \delta_{l,j}(k)\, f_j(k)\, P_{l,j}\right) + \tilde q_{l,j}(k),
\]



Qv(k+1)-Q v (k) < 

fi(k)T(k) nr; l^yS^Pu - d v A + 

and $w_j(k+1) - w_j(k) \le T(k)\left(p_j - f_j(k)\right) + \tilde w_j(k)$.

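Because of equations (4.25), (4.26) and (4.27), we have
\begin{align}
\tilde q_{l,j}(k)\, q_{l,j}(k) &\le \eta_{\max}^2 \tag{4.28} \\
\tilde Q_{l'}(k)\, Q_{l'}(k) &\le \eta_{\max}^2 \tag{4.29} \\
\tilde w_j(k)\, w_j(k) &\le T_{\max}^2 . \tag{4.30}
\end{align}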
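
The bounds (4.28) to (4.30) all concern a quantity of the form max{offered service - backlog, 0} multiplied by the backlog. A small numeric check can make them plausible under the assumption, suggested by the note following (4.27), that the service offered in one iteration never exceeds the cap (the maximum queue increment for the queues, the maximum route time for the counters). The function names and the cap value below are hypothetical and chosen only for illustration.

```python
def unused_service(offered, backlog):
    """max{offered - backlog, 0}: the common form of (4.25)-(4.27)."""
    return max(offered - backlog, 0.0)


def worst_product(cap, steps=200):
    """Largest unused_service * backlog found on a grid with offered <= cap."""
    worst = 0.0
    for i in range(steps + 1):
        offered = cap * i / steps  # offered service in [0, cap]
        for j in range(3 * steps + 1):
            backlog = cap * j / steps  # backlog in [0, 3 * cap]
            worst = max(worst, unused_service(offered, backlog) * backlog)
    return worst


ETA_MAX = 100.0  # hypothetical cap, not a value from the experiments
print(worst_product(ETA_MAX), "<=", ETA_MAX ** 2)
```

Under this assumption the product is zero whenever the backlog exceeds the offered service, and otherwise it is largest when the backlog is half of the cap, where it equals a quarter of the cap squared. The stated bounds therefore hold with room to spare.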

Let 

Mk) = E^(^#) ( 4 - 31 ) 

1,3 

and 



1,3 



+Y,fMQi'(k) |%-5]i w M*) p w ) • ( 4 - 32 ) 
Let
\[
C = 2\sum_{l,j} \eta_{\max}^2 + 2\sum_{l'} \eta_{\max}^2 + 2\kappa \sum_j T_{\max}^2 . \tag{4.33}
\]

Then,
\begin{align}
\Delta V(k) \le{} & 2A(k)T(k) - 2B(k)T(k) + 2C \nonumber \\
 & - 2\kappa \sum_j T(k)\left(f_j(k) - p_j\right) w_j(k) \tag{4.34}
\end{align}
due to equations (4.28), (4.29), (4.30), (4.31), (4.32), and (4.33).

Since $(1+\epsilon)x \in \Lambda$, there exist $\{\hat y^{d_l}_{l,j}\}$, $\{\hat f_j\}$, and $\{\hat\delta_{l,j}\}$ in $\Gamma_{(1+\epsilon)x}$ that
satisfy Definition 3. Let



hi 



+J2f^'^ \ d vj -EWuflj • ( 4 - 35 ) 

V l J 



Then,
\[
\Delta V(k) \le 2T(k)\Big[A(k) - \big(B(k) - \hat B(k) + \hat B(k)\big) - \kappa \sum_j \left(f_j(k) - p_j\right) w_j(k)\Big] + 2C .
\]

To the RHS of the above inequality, we add and subtract the following two
terms, 1) $2K\sum_j \hat f_j b_j T(k)$ and 2) $2K\sum_j f_j(k) b_j T(k)$, and add $2\kappa T(k)\sum_j (\hat f_j - p_j) w_j(k)$
($\ge 0$ since $\hat f_j$ satisfies inequality (4.3)). Thus, we get

AV(k) < 2T(k)^A(k) - (B(k) - B(k)+B(k) 

j j I j 

j j 
(./} - /',) «•,(/••) +2C 

j 

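\begin{align}
{}\le{} & 2A(k)T(k) - 2\hat B(k)T(k) + 2C \nonumber \\
 & - 2K\sum_j f_j(k)\, b_j T(k) + 2K\sum_j \hat f_j\, b_j T(k). \tag{4.36}
\end{align}
Inequality (4.36) holds because
\[
B(k) - K\sum_j f_j(k)\, b_j + \kappa\sum_j \left(f_j(k) - p_j\right) w_j(k)
\ \ge\ \hat B(k) - K\sum_j \hat f_j\, b_j + \kappa\sum_j \left(\hat f_j - p_j\right) w_j(k)
\]
by algorithms (4.13) and (4.14). Hence,

\[
\Delta V(k) \le 2A(k)T(k) - 2\hat B(k)T(k) + 2C - 2KT(k)\sum_j b_j\left(f_j(k) - \hat f_j\right)
\]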



< 2T(k) 



1,3 



Y2C 



(4.37) 



< 2T{k) 



1,3 



1,3 

2C. 



(4.38) 
Equation (4.37) follows because $\{\hat f_j\}$ and $\{\hat\delta_{l,j}\}$ satisfy (4.5). Equation (4.38)
follows because $\{\hat y^{d_l}_{l,j}\}$ satisfies (4.4). Adding and subtracting



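\[
2K\sum_{l,j} T(k)\, a_{l,j}\, y^{d_l}_{l,j}(k)
\]
to the RHS of equation (4.38), we have
\begin{align}
\Delta V(k) \le{} & 2\sum_{l,j} T(k)\, y^{d_l}_{l,j}(k)\left(K a_{l,j} + q_{l,j}(k)\right) + 2C \nonumber \\
 & - 2\sum_{l,j} T(k)\, \hat y^{d_l}_{l,j}\, q_{l,j}(k) - 2K\sum_{l,j} T(k)\, a_{l,j}\, y^{d_l}_{l,j}(k) \nonumber \\
 & - 2KT(k)\sum_j b_j\left(f_j(k) - \hat f_j\right). \tag{4.39}
\end{align}
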

Note that up to this point, we have only used the fact that $(1+\epsilon)x \in \Lambda$
(and various substitutions) to arrive at the upper bound (4.39). Because of
algorithm (4.12), we have
\[
\sum_{l,j} T(k)\, y^{d_l}_{l,j}(k)\left(K a_{l,j} + q_{l,j}(k)\right)
= \sum_l x^{d_l}_l\, T(k) \min_j \left\{K a_{l,j} + q_{l,j}(k)\right\}. \tag{4.40}
\]
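
Equation (4.40) holds when each source places its entire rate on the route with the smallest value of the pick-up cost scaled by K plus the current queue length. A minimal sketch of such a rule follows. It is only an illustration of what (4.40) implies and not a transcription of algorithm (4.12); the function names, the tie-breaking by lowest route index, and the example numbers are all assumptions.

```python
def pick_route(K, pickup_costs, queues):
    """Index j minimizing K * a[j] + q[j]; ties go to the lowest index."""
    weights = [K * a + q for a, q in zip(pickup_costs, queues)]
    return min(range(len(weights)), key=weights.__getitem__)


def split_rate(rate, K, pickup_costs, queues):
    """Put the whole source rate on the minimizing route and zero elsewhere."""
    chosen = pick_route(K, pickup_costs, queues)
    return [rate if j == chosen else 0.0 for j in range(len(queues))]


# Hypothetical source with a costly route (index 0) and a free route (index 1).
costs = [1.0, 0.0]
print(split_rate(40.0, 150, costs, queues=[50.0, 120.0]))  # free route is cheaper
print(split_rate(40.0, 150, costs, queues=[50.0, 250.0]))  # its backlog now outweighs the cost
```

With a rule of this kind, the costly route is used only once the queue on the cheaper route exceeds the other queue by K times the cost difference, which is consistent with the longer queues observed for larger K in Figures 4.4 and 4.5.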

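Since $\sum_j \hat y^{d_l}_{l,j} = (1+\epsilon)\, x^{d_l}_l$,
\begin{align}
(1+\epsilon) & \sum_l x^{d_l}_l\, T(k) \min_j \left\{K a_{l,j} + q_{l,j}(k)\right\} \nonumber \\
 & \le \sum_{l,j} T(k)\, \hat y^{d_l}_{l,j}\, q_{l,j}(k) + K\sum_{l,j} a_{l,j}\, \hat y^{d_l}_{l,j}\, T(k). \tag{4.41}
\end{align}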
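Thus,
\begin{align*}
\Delta V(k) \le{} & 2C - 2K\sum_{l,j} T(k)\, a_{l,j}\left(y^{d_l}_{l,j}(k) - \hat y^{d_l}_{l,j}\right) \\
 & - 2\epsilon \sum_l x^{d_l}_l\, T(k) \min_j \left\{q_{l,j}(k) + K a_{l,j}\right\} \\
 & - 2KT(k)\sum_j b_j\left(f_j(k) - \hat f_j\right).
\end{align*}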

Since $a_{l,j} \le a_{\max}$, $b_j \le b_{\max}$, $T_{\min} \le T(k) \le T_{\max}$, and $T(k)\, y^{d_l}_{l,j}(k),\ T(k)\, \hat y^{d_l}_{l,j} \le
\eta_{\max}$, we have

<4ax = 2C + 2/i ^r/ max + 2/i ^6 max T max 



> 2C-2Ji^T(A;)a ij (^.(fc) 
-2KT(k)J2bj(m)-fj 



~di 



and 



AV(fc) < g; ax -26^xf i T(A;)min{g u (A ; ) + ^ J } 



Let

\[
\nu = \min_l x^{d_l}_l \quad \text{s.t. } x^{d_l}_l > 0,
\]

i.e., $\nu$ is the smallest source rate greater than 0 among all sources in the
network. If there is $l$ such that $q_{l,j}(k) > q_{\max} = q'_{\max}/(2\epsilon T_{\min}\nu)$ for all $j$, then
equation (4.24) holds. Thus,

max » Vl,j,k. (4.42) 
The control decision in equation (4.14) combined with the bound (4.42)
prevents $Q_{l'}(k)$ from exceeding $q_{\max} + \eta_{\max}$ for any $l'$, $k$. Thus,
$Q_{l'}(k) \le q_{\max} + \eta_{\max}$, $\forall l', k$.

Because of equation (4.42), $A(k)T(k) \le \sum_{l,j} \eta_{\max} q_{\max}$. By simply
adding $2K\sum_j f_j(k)\, b_j T(k) \ge 0$ to eq. (4.34), we have

\begin{align*}
\Delta V(k) \le{} & 2A(k)T(k) - 2B(k)T(k) + 2C \\
 & - 2\kappa \sum_j T(k)\left(f_j(k) - p_j\right) w_j(k) + 2K\sum_j f_j(k)\, b_j T(k).
\end{align*}



If there is Wj(k) such that 
Wj{k) > 



/^jl j '/max^max ~\~ C ~\~ 2 Kb max Tj 



max 



k(1 -pj)T min 
then 

B(k)T(k) + (fj^-p^wjik) - Kj2fj(k)bjT(k) 

j j 

^ ^ ^ '/max?max ~\- C ~\~ li. ^ ^ ^max-^max 
1,3 j 

by algorithm (4.13), which implies that AV(fc) < 0. Thus, if 

w > Et,j W9max + C + if gj ^ma x T max 

ft (1 Pj) T m ; n 

for some j, fc, then AV(k) satisfies equation (4.24). Therefore, 

, . j ^max?max ~\~ C ~\~ if ■ ^maximax 
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Appendix B: Proof of Theorem 4 

Let C be as defined in eq. (4.33). Then, by eq. (4.39), we have 

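\begin{align*}
\Delta V(k) \le{} & 2C - 2K\sum_{l,j} T(k)\, a_{l,j}\left(y^{d_l}_{l,j}(k) - \hat y^{d_l}_{l,j}\right) \\
 & - 2KT(k)\sum_j b_j\left(f_j(k) - \hat f_j\right)
\end{align*}
because of eqs. (4.40) and (4.41) and since $\sum_j \hat y^{d_l}_{l,j} = x^{d_l}_l$. This implies that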



lim 1 Yl AV ^ 

<2C -2K lim 



- 2K ^ nim £ /A 

Since $\{q_{l,j}(k), Q_{l'}(k), w_j(k)\}$ are bounded by Theorem 3, we have

\[
\lim_{k' \to \infty} \frac{1}{k'} \sum_{k < k'} \Delta V(k) = 0,
\]

which implies that

E fc < fc > (E y r(fc)»i4(fc) + Ej fs(k)T(k)b 3 ) 

lim = — 

<E a ^fi + E^ + C/^ (4-43) 
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In addition, we have 

1,3 3 

> maiji + T(k) fibs (4-44) 

1,3 3 

for each $k$. Eq. (4.44) holds because $\{\hat f_j, \hat y^{d_l}_{l,j}\}$ is an optimal solution to eq.
(4.9). Eqs. (4.44) and (4.43) show that $\sum_{l,j} a_{l,j}\, \hat y^{d_l}_{l,j} + \sum_j \hat f_j\, b_j$ is $\le$ and $\ge$ the
RHS of eq. (4.20), respectively. ■
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Chapter 5 



Conclusion 



Existing communication algorithms depend on the dynamics
of the network being either so fast that any fluctuations can be averaged out
or slow enough to be tracked. However, when these algorithms are deployed on
mobile communication networks, they fail to operate or are highly inefficient,
since the time-scale separation assumption on which they are built
does not hold in mobile networks. The routing and rate control algorithms
we have presented in this dissertation solve this problem by relying not on
tracking or averaging out the network dynamics but on exploiting local queue
information and the network's ability to dynamically adjust itself, all the while
maintaining high efficiency and not sacrificing throughput.

In Chapter 2, we proposed modifications to the TCP controller to adapt
it to hybrid downlink networks. The throughputs obtained via our modifications
were shown to be proportional to E[P] in the multi-path/multi-homing
scenario, without the source tracking the channel quality information.

In Chapter 3, we presented back-pressure rate-control/routing algorithms
adapted for intermittently connected networks. Our proposed algorithms
resolve the time-scale coupling of the traditional back-pressure
algorithm: either intermittent connectivity feeds back the wrong congestion
signal to the inter-cluster source, making it believe that the connection is a
low-rate link, or, in order to sustain a high inter-cluster rate, one has to maintain
large queues at internal nodes that are nowhere near the intermittent links.
We have verified that our algorithms work on a simple line network and on a
larger 16-node network.

Lastly, in Chapter 4, we studied a network that uses a mobile carrier
to transport data between stationary nodes. In our work, the mobile can
change its movement dynamically in response to the data traffic in the network.
We developed a cost minimization framework for such a network,
designed a joint mobile-stationary algorithm that minimizes the sum cost,
and demonstrated our algorithm on a wireless testbed.

There are many potential research topics related to our work here on 
mobile communication networks. We highlight a few of them here. 

• Generic mobile network: The mobile transport network studied
in Chapter 4 assumes that the mobiles return to a single decision
point to choose the next route. Though useful in many scenarios,
this network model is not generic enough to cover every type of mobile
transport network. A possible future topic is to extend
the network model and design a distributed cost-minimizing algorithm
for the extended model.

• Delay reduction: In Chapter 3, we designed a delay reduction
algorithm for wireless networks using shadow packets. One disadvantage
of shadow packets is that they are not energy efficient. In some networks
and applications, minimum delay is more important than maximum
throughput. Designing an energy-efficient delay reduction algorithm
using shadow packets would be another possible topic.

• Communication security: Improving security in wireless communication
is an active area of research. Advancements in full-duplex radio
technology may enable secure wireless communication in the cellular downlink
networks we studied in Chapter 2 or, combined with the mobile
transport networks of Chapters 3 and 4, in military applications.